Market-Mix-Modelling: Budget_Optimization

Objective

ElecKart is an e-commerce firm based out of Ontario, Canada, specialising in electronic products. Over the last year, they spent a significant amount of money on marketing. Occasionally, they also offered big-ticket promotions (similar to the Big Billion Day). They are about to create a marketing budget for the next year, which includes spending on commercials, online campaigns, and pricing & promotion strategies. The CFO feels that the money spent on marketing over the last 12 months was not sufficiently impactful, and that they can either cut the budget or reallocate it optimally across marketing levers to improve the revenue response.

Imagine that you are part of the marketing team working on budget optimisation. You need to develop a market mix model to observe the actual impact of different marketing variables over the last year. Using your understanding of the model, you have to recommend the optimal budget allocation for different marketing levers for the next year.

The objective of this project is to create a market mix model for 3 product sub-categories - Camera accessory, Gaming accessory and Home Audio - to observe the actual impact of various marketing variables over one year (July 2015 to June 2016) and recommend the optimal budget allocation for different marketing levers for the next year.

Step 1: Importing packages

In [1]:
# import libraries
import pandas as pd
import numpy as np

# For Visualisation
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import matplotlib

# Suppress Warnings
import warnings
warnings.filterwarnings('ignore')

# Pandas Settings
# pd.options.display.float_format = '{:.1f}'.format
pd.set_option('display.max_rows', 40000)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

# Miscellaneous imports
from datetime import datetime
from scipy.stats import norm
import re

Step 2: Reading Data

Reading the main data file - ConsumerElectronics.csv into a dataframe
In [2]:
main_df = pd.read_csv('ConsumerElectronics.csv')

main_df.head()
Out[2]:
fsn_id order_date Year Month order_id order_item_id gmv units deliverybdays deliverycdays s1_fact.order_payment_type sla cust_id pincode product_analytic_super_category product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla
0 ACCCX3S58G7B5F6P 2015-10-17 15:11:54 2015 10 3.419301e+15 3.419301e+15 6400 1 \N \N COD 5 -1.01299130778588E+018 -7.79175582905735E+018 CE CameraAccessory CameraAccessory CameraTripod 7190 0
1 ACCCX3S58G7B5F6P 2015-10-19 10:07:22 2015 10 1.420831e+15 1.420831e+15 6900 1 \N \N COD 7 -8.99032457905512E+018 7.33541149097431E+018 CE CameraAccessory CameraAccessory CameraTripod 7190 0
2 ACCCX3S5AHMF55FV 2015-10-20 15:45:56 2015 10 2.421913e+15 2.421913e+15 1990 1 \N \N COD 10 -1.0404429420466E+018 -7.47768776228657E+018 CE CameraAccessory CameraAccessory CameraTripod 2099 3
3 ACCCX3S5AHMF55FV 2015-10-14 12:05:15 2015 10 4.416592e+15 4.416592e+15 1690 1 \N \N Prepaid 4 -7.60496084352714E+018 -5.83593163877661E+018 CE CameraAccessory CameraAccessory CameraTripod 2099 3
4 ACCCX3S5AHMF55FV 2015-10-17 21:25:03 2015 10 4.419525e+15 4.419525e+15 1618 1 \N \N Prepaid 6 2.8945572083453E+018 5.34735360997242E+017 CE CameraAccessory CameraAccessory CameraTripod 2099 3
Understanding some attributes

SKU => Stock Keeping Unit

deliverybdays => days to get item or order from warehouse for shipping

deliverycdays => days to deliver item to customer

Reading the product list excel data file into a dataframe
In [3]:
product_list = pd.read_excel('Media data and other information.xlsx', sheet_name='Product List')

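# Rename the second column via rename() instead of mutating columns.values in place
product_list = product_list.rename(columns={product_list.columns[1]: 'product_analytic_vertical'})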

product_list.drop(product_list.columns[0], axis=1, inplace = True)

product_list.head()
Out[3]:
product_analytic_vertical Frequency Percent
0 \N 5828 0.353464
1 AmplifierReceiver 4056 0.245994
2 AudioMP3Player 112892 6.846819
3 Binoculars 14599 0.885419
4 BoomBox 2879 0.174609
Reading the Media Investment excel data file into a dataframe
In [4]:
media_investment = pd.read_excel('Media data and other information.xlsx', sheet_name='Media Investment', skiprows=2)

media_investment.drop(media_investment.columns[0], axis=1, inplace = True)

media_investment.head()
Out[4]:
Year Month Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 2015 7 17.061775 0.215330 2.533014 7.414270 0.000933 1.327278 0.547254 5.023697 NaN NaN
1 2015 8 5.064306 0.006438 1.278074 1.063332 0.000006 0.129244 0.073684 2.513528 NaN NaN
2 2015 9 96.254380 3.879504 1.356528 62.787651 0.610292 16.379990 5.038266 6.202149 NaN NaN
3 2015 10 170.156297 6.144711 12.622480 84.672532 3.444075 24.371778 6.973711 31.927011 NaN NaN
4 2015 11 51.216220 4.220630 1.275469 14.172116 0.168633 19.561574 6.595767 5.222032 NaN NaN
Reading the Sale Calendar excel data file into a dataframe
In [5]:
sale_calendar = pd.read_excel('Media data and other information.xlsx', sheet_name='Special Sale Calendar', \
                              skiprows=0, skipfooter=2)

sale_calendar.drop(sale_calendar.columns[0], axis=1, inplace = True)

sale_calendar.iloc[1:6, 0] = sale_calendar.iloc[0, 0]
sale_calendar.iloc[7:, 0] = sale_calendar.iloc[6, 0]

sale_calendar
Out[5]:
Unnamed: 1 Sales Calendar
0 2015.0 (18-19th July)
1 2015.0 (15-17th Aug)
2 2015.0 (28-30th Aug)
3 2015.0 (17-15th Oct)
4 2015.0 (7-14th Nov)
5 2015.0 (25th Dec'15 - 3rd Jan'16)
6 2016.0 (20-22 Jan)
7 2016.0 (1-2 Feb)
8 2016.0 (20-21 Feb)
9 2016.0 (14-15 Feb)
10 2016.0 (7-9 Mar)
11 2016.0 (25-27 May)
Reading the net_promoter_score excel file into a dataframe
In [6]:
net_promoter_score = pd.read_excel('Media data and other information.xlsx', sheet_name='Monthly NPS Score', \
                              skiprows=0)

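# Rename the first column via rename() instead of mutating columns.values in place
net_promoter_score = net_promoter_score.rename(columns={net_promoter_score.columns[0]: 'score'})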

net_promoter_score
Out[6]:
score July'15 Aug'15 Sept'15 Oct'15 Nov'15 Dec'15 Jan'16 Feb'16 Mar'16 Apr'16 May'16 June'16
0 NPS 54.599588 59.987101 46.925419 44.398389 47.0 45.8 47.093031 50.327406 49.02055 51.827605 47.306951 50.516687
1 Stock Index 1177.000000 1206.000000 1101.000000 1210.000000 1233.0 1038.0 1052.000000 1222.000000 1015.00000 1242.000000 1228.000000 1194.000000
Reading the product details Word file into a dataframe
In [7]:
# pip install --pre python-docx
import pandas as pd
import io
import csv
from docx import Document

def read_docx_tables(filename, tab_id=None, **kwargs):
    """
    parse table(s) from a Word Document (.docx) into Pandas DataFrame(s)

    Parameters:
        filename:   file name of a Word Document

        tab_id:     parse a single table with the index: [tab_id] (counting from 0).
                    When [None] - return a list of DataFrames (parse all tables)

        kwargs:     arguments to pass to `pd.read_csv()` function

    Return: a single DataFrame if tab_id != None or a list of DataFrames otherwise
    """
    def read_docx_tab(tab, **kwargs):
        vf = io.StringIO()
        writer = csv.writer(vf)
        for row in tab.rows:
            writer.writerow(cell.text for cell in row.cells)
        vf.seek(0)
        return pd.read_csv(vf, **kwargs)

    doc = Document(filename)
    if tab_id is None:
        return [read_docx_tab(tab, **kwargs) for tab in doc.tables]
    else:
        try:
            return read_docx_tab(doc.tables[tab_id], **kwargs)
        except IndexError:
            print('Error: specified [tab_id]: {}  does not exist.'.format(tab_id))
            raise

dfs = read_docx_tables('Product Details.docx')

dfs[0]
Out[7]:
super_category category sub_category vertical
0 CE Camera Camera Camcorders
1 CE Camera Camera DSLR
2 CE Camera Camera Instant Cameras
3 CE Camera Camera Point & Shoot
4 CE Camera Camera SportsAndAction
5 CE CameraAccessory CameraAccessory Binoculars
6 CE CameraAccessory CameraAccessory CameraAccessory
7 CE CameraAccessory CameraAccessory CameraBag
8 CE CameraAccessory CameraAccessory CameraBattery
9 CE CameraAccessory CameraAccessory CameraBatteryCharger
10 CE CameraAccessory CameraAccessory CameraBatteryGrip
11 CE CameraAccessory CameraAccessory CameraEyeCup
12 CE CameraAccessory CameraAccessory CameraFilmRolls
13 CE CameraAccessory CameraAccessory CameraHousing
14 CE CameraAccessory CameraAccessory CameraLEDLight
15 CE CameraAccessory CameraAccessory CameraMicrophone
16 CE CameraAccessory CameraAccessory CameraMount
17 CE CameraAccessory CameraAccessory CameraRemoteControl
18 CE CameraAccessory CameraAccessory CameraTripod
19 CE CameraAccessory CameraAccessory ExtensionTube
20 CE CameraAccessory CameraAccessory Filter
21 CE CameraAccessory CameraAccessory Flash
22 CE CameraAccessory CameraAccessory FlashShoeAdapter
23 CE CameraAccessory CameraAccessory Lens
24 CE CameraAccessory CameraAccessory ReflectorUmbrella
25 CE CameraAccessory CameraAccessory Softbox
26 CE CameraAccessory CameraAccessory Strap
27 CE CameraAccessory CameraAccessory Teleconverter
28 CE CameraAccessory CameraAccessory Telescope
29 CE CameraAccessory CameraStorage CameraStorageMemoryCard
30 CE EntertainmentSmall AmplifierReceiver AmplifierReceiver
31 CE EntertainmentSmall AudioAccessory Microphone
32 CE EntertainmentSmall AudioAccessory MicrophoneAccessory
33 CE EntertainmentSmall AudioMP3Player AudioMP3Player
34 CE EntertainmentSmall HomeAudio BoomBox
35 CE EntertainmentSmall HomeAudio DJController
36 CE EntertainmentSmall HomeAudio Dock
37 CE EntertainmentSmall HomeAudio DockingStation
38 CE EntertainmentSmall HomeAudio FMRadio
39 CE EntertainmentSmall HomeAudio HiFiSystem
40 CE EntertainmentSmall HomeAudio HomeAudioSpeaker
41 CE EntertainmentSmall HomeAudio KaraokePlayer
42 CE EntertainmentSmall HomeAudio SlingBox
43 CE EntertainmentSmall HomeAudio SoundMixer
44 CE EntertainmentSmall HomeAudio VoiceRecorder
45 CE EntertainmentSmall HomeTheatre HomeTheatre
46 CE EntertainmentSmall Speaker \N
47 CE EntertainmentSmall Speaker LaptopSpeaker
48 CE EntertainmentSmall Speaker MobileSpeaker
49 CE EntertainmentSmall TVVideoSmall RemoteControl
50 CE EntertainmentSmall TVVideoSmall SelectorBox
51 CE EntertainmentSmall TVVideoSmall VideoGlasses
52 CE EntertainmentSmall TVVideoSmall VideoPlayer
53 CE GameCDDVD Game CodeInTheBoxGame
54 CE GameCDDVD Game PhysicalGame
55 CE GameCDDVD GameMembershipCards GameValueCards
56 CE GamingHardware GamingAccessory CoolingPad
57 CE GamingHardware GamingAccessory GameControlMount
58 CE GamingHardware GamingAccessory GamePad
59 CE GamingHardware GamingAccessory GamingAccessoryKit
60 CE GamingHardware GamingAccessory GamingAdapter
61 CE GamingHardware GamingAccessory GamingChargingStation
62 CE GamingHardware GamingAccessory GamingGun
63 CE GamingHardware GamingAccessory GamingHeadset
64 CE GamingHardware GamingAccessory GamingKeyboard
65 CE GamingHardware GamingAccessory GamingMemoryCard
66 CE GamingHardware GamingAccessory GamingMouse
67 CE GamingHardware GamingAccessory GamingMousePad
68 CE GamingHardware GamingAccessory GamingSpeaker
69 CE GamingHardware GamingAccessory JoystickGamingWheel
70 CE GamingHardware GamingAccessory MotionController
71 CE GamingHardware GamingAccessory TVOutCableAccessory
72 CE GamingHardware GamingConsole GamingConsole
73 CE GamingHardware GamingConsole HandheldGamingConsole
In [8]:
# Dropping the extra row(s)/column(s)

product_details = pd.DataFrame(dfs[0])

product_details.drop(product_details.columns[0], axis=1, inplace = True)

product_details.head()
Out[8]:
category sub_category vertical
0 Camera Camera Camcorders
1 Camera Camera DSLR
2 Camera Camera Instant Cameras
3 Camera Camera Point & Shoot
4 Camera Camera SportsAndAction

Step 3: Cleaning Data

In [9]:
main_df.head()
Out[9]:
fsn_id order_date Year Month order_id order_item_id gmv units deliverybdays deliverycdays s1_fact.order_payment_type sla cust_id pincode product_analytic_super_category product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla
0 ACCCX3S58G7B5F6P 2015-10-17 15:11:54 2015 10 3.419301e+15 3.419301e+15 6400 1 \N \N COD 5 -1.01299130778588E+018 -7.79175582905735E+018 CE CameraAccessory CameraAccessory CameraTripod 7190 0
1 ACCCX3S58G7B5F6P 2015-10-19 10:07:22 2015 10 1.420831e+15 1.420831e+15 6900 1 \N \N COD 7 -8.99032457905512E+018 7.33541149097431E+018 CE CameraAccessory CameraAccessory CameraTripod 7190 0
2 ACCCX3S5AHMF55FV 2015-10-20 15:45:56 2015 10 2.421913e+15 2.421913e+15 1990 1 \N \N COD 10 -1.0404429420466E+018 -7.47768776228657E+018 CE CameraAccessory CameraAccessory CameraTripod 2099 3
3 ACCCX3S5AHMF55FV 2015-10-14 12:05:15 2015 10 4.416592e+15 4.416592e+15 1690 1 \N \N Prepaid 4 -7.60496084352714E+018 -5.83593163877661E+018 CE CameraAccessory CameraAccessory CameraTripod 2099 3
4 ACCCX3S5AHMF55FV 2015-10-17 21:25:03 2015 10 4.419525e+15 4.419525e+15 1618 1 \N \N Prepaid 6 2.8945572083453E+018 5.34735360997242E+017 CE CameraAccessory CameraAccessory CameraTripod 2099 3
In [10]:
main_df.dtypes
Out[10]:
fsn_id                              object
order_date                          object
Year                                 int64
Month                                int64
order_id                           float64
order_item_id                      float64
gmv                                 object
units                                int64
deliverybdays                       object
deliverycdays                       object
s1_fact.order_payment_type          object
sla                                  int64
cust_id                             object
pincode                             object
product_analytic_super_category     object
product_analytic_category           object
product_analytic_sub_category       object
product_analytic_vertical           object
product_mrp                          int64
product_procurement_sla              int64
dtype: object

Correcting Data Types

In [11]:
# String to datetime

main_df['order_date'] =  pd.to_datetime(main_df['order_date'], format='%Y-%m-%d %H:%M:%S')
In [12]:
# Float to object

main_df[['order_id','order_item_id']] = main_df[['order_id','order_item_id']].astype(object)
In [13]:
# Int to string

main_df[['Year','Month']] = main_df[['Year','Month']].astype(str)
Assuming that a "\N" value in deliverybdays and deliverycdays means 0 days, we convert both columns to numeric and impute the nulls this creates with 0
In [14]:
# Non-numeric entries such as "\N" become NaN and are then filled with 0.
# Assigning the result back avoids calling fillna(inplace=True) on a column selection.
main_df['deliverybdays'] = pd.to_numeric(main_df['deliverybdays'], errors='coerce').fillna(0)

main_df['deliverycdays'] = pd.to_numeric(main_df['deliverycdays'], errors='coerce').fillna(0)
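The effect of errors='coerce' on the "\N" placeholder can be checked on a toy series. This is a minimal sketch with made-up values; treating "\N" as zero days remains the notebook's assumption, not something the data confirms.

```python
import pandas as pd

# Toy values (hypothetical) mimicking the raw deliverybdays column,
# where "\N" marks a missing entry.
raw = pd.Series(["\\N", "3", "-1", "\\N", "7"])

# errors='coerce' turns anything non-numeric into NaN instead of raising.
as_numeric = pd.to_numeric(raw, errors="coerce")
n_placeholders = int(as_numeric.isna().sum())

# Impute the NaNs with 0, following the assumption stated above.
imputed = as_numeric.fillna(0)

print(n_placeholders)
print(imputed.tolist())
```

Note that genuine negative values survive the conversion untouched, so they still need separate handling.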
In [15]:
# String to numeric (float); non-numeric entries become NaN

main_df['gmv'] = pd.to_numeric(main_df['gmv'], errors='coerce')
In [16]:
main_df.dtypes
Out[16]:
fsn_id                                     object
order_date                         datetime64[ns]
Year                                       object
Month                                      object
order_id                                   object
order_item_id                              object
gmv                                       float64
units                                       int64
deliverybdays                             float64
deliverycdays                             float64
s1_fact.order_payment_type                 object
sla                                         int64
cust_id                                    object
pincode                                    object
product_analytic_super_category            object
product_analytic_category                  object
product_analytic_sub_category              object
product_analytic_vertical                  object
product_mrp                                 int64
product_procurement_sla                     int64
dtype: object

Unique Values

In [17]:
# Unique value frequencies

unique_values = pd.DataFrame(main_df.apply(lambda x: len(x.value_counts(dropna=False)), axis=0), columns=['Unique Value Count']).sort_values(by='Unique Value Count', ascending=True)

unique_values['dtype'] = pd.DataFrame(main_df.dtypes)

unique_values
Out[17]:
Unique Value Count dtype
product_analytic_super_category 1 object
Year 2 object
s1_fact.order_payment_type 2 object
product_analytic_category 5 object
Month 12 object
product_analytic_sub_category 14 object
product_procurement_sla 17 int64
units 27 int64
sla 60 int64
product_analytic_vertical 74 object
deliverybdays 142 float64
deliverycdays 170 float64
product_mrp 1929 int64
gmv 12524 float64
pincode 12973 object
fsn_id 21219 object
order_date 1155192 datetime64[ns]
cust_id 1253495 object
order_item_id 1480765 object
order_id 1501177 object
In [18]:
initial_shape = main_df.shape
initial_shape
Out[18]:
(1648824, 20)

Fix Invalid Values

Treating incorrect GMV values w.r.t product_mrp * units

In [19]:
# Instances where GMV values are greater than MRP * units which is incorrect

print(main_df.loc[main_df['product_mrp'] * main_df['units'] < main_df['gmv']].shape[0])

print(round(100*(main_df.loc[main_df['product_mrp'] * main_df['units'] < main_df['gmv']].shape[0] / main_df.shape[0]), 2))
38569
2.34

There are 38569 records (2.34%) in the dataframe where the GMV value is greater than MRP * units.

We will impute the faulty MRP values with gmv/units, so that MRP * units is never below the GMV.
In [20]:
# Where the MRP is below the per-unit GMV, replace it with gmv / units
main_df['product_mrp'] = np.where(main_df['product_mrp'] < main_df['gmv'] / main_df['units'],
                                  main_df['gmv'] / main_df['units'], main_df['product_mrp'])

main_df.shape
Out[20]:
(1648824, 20)
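The MRP correction can be sketched on a toy frame. The values below are hypothetical and only illustrate the rule: the MRP is raised to the per-unit GMV where it is too low, and left alone otherwise.

```python
import numpy as np
import pandas as pd

# Toy orders (hypothetical values): the second row has gmv > product_mrp * units.
toy = pd.DataFrame({
    "gmv": [900.0, 2400.0, 500.0],
    "units": [1, 2, 1],
    "product_mrp": [1000.0, 1000.0, 500.0],
})

per_unit_gmv = toy["gmv"] / toy["units"]

# Raise the MRP to the per-unit GMV only where the MRP is too low.
toy["product_mrp"] = np.where(toy["product_mrp"] < per_unit_gmv,
                              per_unit_gmv, toy["product_mrp"])

# After the correction, no row should have gmv above product_mrp * units.
n_violations = int((toy["product_mrp"] * toy["units"] < toy["gmv"]).sum())

print(toy["product_mrp"].tolist(), n_violations)
```

The row count is unchanged by this step, which matches the identical shape reported before and after the cell.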

Reinspecting instances where GMV value is greater than the MRP * units.

In [21]:
# Instances where GMV values are greater than MRP * units which is incorrect

print(main_df.loc[main_df['product_mrp'] * main_df['units'] < main_df['gmv']].shape[0])

print(round(100*(main_df.loc[main_df['product_mrp'] * main_df['units'] < main_df['gmv']].shape[0] / main_df.shape[0]), 2))
0
0.0

All erroneous instances have been corrected (the rows were imputed, not removed)

Treating GMV values less than 0

In [22]:
print(main_df.loc[main_df['gmv'] < 0].shape[0])

print(round(100*(main_df.loc[main_df['gmv'] < 0].shape[0]/main_df.shape[0]), 2))
0
0.0

Treating MRP values less than 0

In [23]:
print(main_df.loc[main_df['product_mrp'] < 0].shape[0])

print(round(100*(main_df.loc[main_df['product_mrp'] < 0].shape[0]/main_df.shape[0]), 2))
0
0.0

Treating Units values less than or equal to 0

In [24]:
print(main_df.loc[main_df['units'] <= 0].shape[0])

print(round(100*(main_df.loc[main_df['units'] <= 0].shape[0]/main_df.shape[0]), 2))
0
0.0

No erroneous rows left in MRP, GMV or Units columns
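The count-and-percentage check is repeated for each column above. A small helper could factor it out; the name count_and_pct and the toy data are assumptions made for this sketch, not part of the original notebook.

```python
import pandas as pd

def count_and_pct(df, mask, ndigits=2):
    """Return (row count, percentage of all rows) for a boolean mask."""
    n = int(mask.sum())
    return n, round(100 * n / len(df), ndigits)

# Toy frame with one negative gmv and one non-positive units value.
toy = pd.DataFrame({"gmv": [100.0, -5.0, 250.0, 0.0],
                    "units": [1, 1, 0, 2]})

neg_gmv = count_and_pct(toy, toy["gmv"] < 0)
bad_units = count_and_pct(toy, toy["units"] <= 0)

print(neg_gmv, bad_units)
```

Passing the mask explicitly keeps the comparison operator (< versus <=) visible at each call site, which makes it easier to keep headings and checks in agreement.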

Handling Negative values for deliverybdays & deliverycdays

In [25]:
print(main_df.loc[main_df['deliverybdays'] < 0].shape[0])
print(round(100*(main_df.loc[main_df['deliverybdays'] < 0].shape[0]/main_df.shape[0]),4))

print(main_df.loc[main_df['deliverycdays'] < 0].shape[0])
print(round(100*(main_df.loc[main_df['deliverycdays'] < 0].shape[0]/main_df.shape[0]),4))
38
0.0023
39
0.0024
  • There are 38 records (0.0023%) in the dataframe with negative values for deliverybdays.
  • There are 39 records (0.0024%) in the dataframe with negative values for deliverycdays.

We will be dropping such rows, since neither the dispatch delay from the warehouse nor the delivery delay to the customer can be negative.

In [26]:
main_df = main_df.loc[(main_df['deliverybdays'] >= 0) & (main_df['deliverycdays'] >= 0)]

main_df.reset_index(drop=True, inplace=True)
In [27]:
main_df.shape
Out[27]:
(1648785, 20)
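Because the filter requires both columns to be non-negative, a row that is negative in both is dropped only once. The shapes above show 39 rows removed in total, which means every row with a negative deliverybdays also has a negative deliverycdays. A minimal sketch on made-up values:

```python
import pandas as pd

# Toy delays (hypothetical): one row is negative in each column.
toy = pd.DataFrame({
    "deliverybdays": [0.0, 2.0, -3.0, 1.0],
    "deliverycdays": [0.0, 3.0, 4.0, -2.0],
})

# Keep a row only if both delay columns are non-negative.
valid = (toy["deliverybdays"] >= 0) & (toy["deliverycdays"] >= 0)
cleaned = toy.loc[valid].reset_index(drop=True)

n_dropped = len(toy) - len(cleaned)

print(n_dropped, len(cleaned))
```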

Handling Negative values for product_procurement_sla

In [28]:
print(main_df.loc[main_df['product_procurement_sla'] < 0].shape[0])

print(round(100*(main_df.loc[main_df['product_procurement_sla'] < 0].shape[0]/main_df.shape[0]),4))
75986
4.6086
  • There are 75986 records (4.61%) in the dataframe with negative values for product_procurement_sla.

We will be dropping such rows, since the time typically taken to procure a product cannot be negative.

In [29]:
main_df = main_df.loc[(main_df['product_procurement_sla'] >= 0)]

main_df.reset_index(drop=True, inplace=True)
In [30]:
main_df.shape
Out[30]:
(1572799, 20)

Handling large values for product_procurement_sla

In [31]:
main_df[['product_procurement_sla']].describe().T
Out[31]:
                             count      mean        std  min  25%  50%  75%     max
product_procurement_sla  1572799.0  5.712887  54.724168  0.0  2.0  2.0  3.0  1000.0
In [32]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(6, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("whitegrid") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

main_df.product_procurement_sla.hist()

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
In [33]:
main_df.product_procurement_sla.value_counts()
Out[33]:
2       528406
1       308504
3       303349
5       222097
4       120190
0        42459
6        18559
7        11746
15        6184
14        5281
1000      4745
13         526
8          523
9           97
10          73
12          60
Name: product_procurement_sla, dtype: int64
In [34]:
print(main_df.loc[main_df['product_procurement_sla'] >= 1000].shape[0])

print(round(100*(main_df.loc[main_df['product_procurement_sla'] >= 1000].shape[0]/main_df.shape[0]),4))
4745
0.3017
  • There are 4745 records (0.3%) in the dataframe with an unusually large product_procurement_sla of 1000.

We will be dropping such rows, since a value of 1000 is implausible next to the other values (the next largest is 15) and is most likely a placeholder.

Retaining rows where product_procurement_sla is less than 1000

In [35]:
# Retaining rows where product_procurement_sla is less than 1000

main_df = main_df.loc[(main_df['product_procurement_sla'] < 1000)]
main_df.head()
Out[35]:
fsn_id order_date Year Month order_id order_item_id gmv units deliverybdays deliverycdays s1_fact.order_payment_type sla cust_id pincode product_analytic_super_category product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla
0 ACCCX3S58G7B5F6P 2015-10-17 15:11:54 2015 10 3.4193e+15 3.4193e+15 6400.0 1 0.0 0.0 COD 5 -1.01299130778588E+018 -7.79175582905735E+018 CE CameraAccessory CameraAccessory CameraTripod 7190.0 0
1 ACCCX3S58G7B5F6P 2015-10-19 10:07:22 2015 10 1.42083e+15 1.42083e+15 6900.0 1 0.0 0.0 COD 7 -8.99032457905512E+018 7.33541149097431E+018 CE CameraAccessory CameraAccessory CameraTripod 7190.0 0
2 ACCCX3S5AHMF55FV 2015-10-20 15:45:56 2015 10 2.42191e+15 2.42191e+15 1990.0 1 0.0 0.0 COD 10 -1.0404429420466E+018 -7.47768776228657E+018 CE CameraAccessory CameraAccessory CameraTripod 2099.0 3
3 ACCCX3S5AHMF55FV 2015-10-14 12:05:15 2015 10 4.41659e+15 4.41659e+15 1690.0 1 0.0 0.0 Prepaid 4 -7.60496084352714E+018 -5.83593163877661E+018 CE CameraAccessory CameraAccessory CameraTripod 2099.0 3
4 ACCCX3S5AHMF55FV 2015-10-17 21:25:03 2015 10 4.41953e+15 4.41953e+15 1618.0 1 0.0 0.0 Prepaid 6 2.8945572083453E+018 5.34735360997242E+017 CE CameraAccessory CameraAccessory CameraTripod 2099.0 3
In [36]:
main_df.shape
Out[36]:
(1568054, 20)
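A suspicious maximum such as 1000 can be spotted by comparing it with the next largest distinct value. The sketch below uses made-up values, and reading the gap as a sign of a placeholder is an assumption, not a rule.

```python
import pandas as pd

# Toy procurement SLA values (hypothetical), with 1000 acting as a sentinel.
sla = pd.Series([2, 1, 3, 2, 1000, 5, 1000, 4, 2, 15])

# Compare the largest distinct value with the runner-up.
distinct = sorted(sla.unique())
largest, runner_up = distinct[-1], distinct[-2]
gap_ratio = largest / runner_up

# Keep only the values below the suspected sentinel.
cleaned = sla[sla < 1000]

print(largest, runner_up, round(gap_ratio, 1), len(cleaned))
```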
In [37]:
main_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1568054 entries, 0 to 1572798
Data columns (total 20 columns):
fsn_id                             1568054 non-null object
order_date                         1568054 non-null datetime64[ns]
Year                               1568054 non-null object
Month                              1568054 non-null object
order_id                           1568054 non-null object
order_item_id                      1568054 non-null object
gmv                                1563535 non-null float64
units                              1568054 non-null int64
deliverybdays                      1568054 non-null float64
deliverycdays                      1568054 non-null float64
s1_fact.order_payment_type         1568054 non-null object
sla                                1568054 non-null int64
cust_id                            1568054 non-null object
pincode                            1568054 non-null object
product_analytic_super_category    1568054 non-null object
product_analytic_category          1568054 non-null object
product_analytic_sub_category      1568054 non-null object
product_analytic_vertical          1568054 non-null object
product_mrp                        1568054 non-null float64
product_procurement_sla            1568054 non-null int64
dtypes: datetime64[ns](1), float64(4), int64(3), object(12)
memory usage: 251.2+ MB

De-Duplicate Data

Convert the categorical string columns to lower case, so that rows differing only in letter case are recognised as duplicates
In [38]:
cat_cols = [cname for cname in main_df.columns if main_df[cname].dtype == "object"]

cat_cols
Out[38]:
['fsn_id',
 'Year',
 'Month',
 'order_id',
 'order_item_id',
 's1_fact.order_payment_type',
 'cust_id',
 'pincode',
 'product_analytic_super_category',
 'product_analytic_category',
 'product_analytic_sub_category',
 'product_analytic_vertical']
In [39]:
# Keeping only the text columns; Year, Month and the ID-like columns are left untouched

cat_cols = ['fsn_id',
 's1_fact.order_payment_type',
 'product_analytic_super_category',
 'product_analytic_category',
 'product_analytic_sub_category',
 'product_analytic_vertical']

for col in cat_cols:
    main_df[col] = main_df[col].str.lower()
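Why lower-casing matters for de-duplication can be seen on a toy frame. The column names follow the notebook, but the values are hypothetical.

```python
import pandas as pd

toy = pd.DataFrame({
    "product_analytic_vertical": ["CameraTripod", "cameratripod", "Lens"],
    "s1_fact.order_payment_type": ["COD", "Prepaid", "cod"],
    "units": [1, 2, 1],
})

text_cols = ["product_analytic_vertical", "s1_fact.order_payment_type"]

# Distinct labels per column before and after normalising the case.
before = {col: toy[col].nunique() for col in text_cols}
for col in text_cols:
    toy[col] = toy[col].str.lower()
after = {col: toy[col].nunique() for col in text_cols}

print(before)
print(after)
```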
Spelling check by investigating the unique values
In [40]:
cat_cols = [cname for cname in main_df.columns if main_df[cname].dtype == "object"]

main_df[cat_cols].describe()
Out[40]:
fsn_id Year Month order_id order_item_id s1_fact.order_payment_type cust_id pincode product_analytic_super_category product_analytic_category product_analytic_sub_category product_analytic_vertical
count 1568054 1568054 1568054 1.568054e+06 1.568054e+06 1568054 1568054 1.568054e+06 1568054 1568054 1568054 1568054
unique 20620 2 12 1.430145e+06 1.410408e+06 2 1199714 1.288000e+04 1 5 14 73
top seldxmc3jnmtqmgv 2016 9 1.560047e+15 3.000373e+17 cod -9.031043e+18 ce entertainmentsmall speaker mobilespeaker
freq 17859 835309 198112 3.400000e+01 3.800000e+01 1130205 4519 7.519000e+03 1568054 897157 501816 249170
In [41]:
# Checking only the low-cardinality columns

selected_cat_cols = ['s1_fact.order_payment_type', 'product_analytic_super_category', 'product_analytic_category', 'product_analytic_sub_category', 'product_analytic_vertical']

for col in selected_cat_cols:
    print('\n################################')
    print('Unique values of ' + str(col))
    print('################################')
    print(pd.Series(main_df[col].unique()).sort_values(ascending=False))
################################
Unique values of s1_fact.order_payment_type
################################
1    prepaid
0        cod
dtype: object

################################
Unique values of product_analytic_super_category
################################
0    ce
dtype: object

################################
Unique values of product_analytic_category
################################
1        gaminghardware
3             gamecddvd
2    entertainmentsmall
0       cameraaccessory
4                camera
dtype: object

################################
Unique values of product_analytic_sub_category
################################
12           tvvideosmall
2                 speaker
10            hometheatre
3               homeaudio
9           gamingconsole
1         gamingaccessory
13    gamemembershipcards
7                    game
4           camerastorage
0         cameraaccessory
8                  camera
6          audiomp3player
11         audioaccessory
5       amplifierreceiver
dtype: object

################################
Unique values of product_analytic_vertical
################################
59              voicerecorder
60                videoplayer
58               videoglasses
12        tvoutcableaccessory
57                  telescope
71              teleconverter
23                      strap
37            sportsandaction
64                 soundmixer
69                    softbox
27                   slingbox
56                selectorbox
55              remotecontrol
70          reflectorumbrella
35              point & shoot
32               physicalgame
9            motioncontroller
17              mobilespeaker
53        microphoneaccessory
54                 microphone
1                        lens
15              laptopspeaker
52              karaokeplayer
13        joystickgamingwheel
38            instant cameras
51                hometheatre
16           homeaudiospeaker
50                 hifisystem
49      handheldgamingconsole
28              gamingspeaker
21             gamingmousepad
10                gamingmouse
14           gamingmemorycard
24             gamingkeyboard
8               gamingheadset
48              gamingconsole
61      gamingchargingstation
11              gamingadapter
7          gamingaccessorykit
68             gamevaluecards
6                     gamepad
63           gamecontrolmount
47                    fmradio
67           flashshoeadapter
3                       flash
20                     filter
46              extensiontube
34                       dslr
44             dockingstation
26                       dock
45               djcontroller
29                 coolingpad
42           codeintheboxgame
0                cameratripod
22    camerastoragememorycard
18        cameraremotecontrol
43                cameramount
66           cameramicrophone
65             cameraledlight
72              camerahousing
41            camerafilmrolls
62               cameraeyecup
39          camerabatterygrip
5        camerabatterycharger
4               camerabattery
2                   camerabag
40            cameraaccessory
36                 camcorders
33                    boombox
19                 binoculars
31             audiomp3player
30          amplifierreceiver
25                         \n
dtype: object

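There are no duplicate labels caused by spelling mistakes. The "\n" entry in product_analytic_vertical is the lower-cased "\N" missing-value placeholder.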
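The visual spelling check could be backed by an automated one. The sketch below uses difflib from the standard library to flag label pairs that are nearly identical; the labels, the helper name near_duplicates and the 0.9 cutoff are all assumptions chosen for illustration.

```python
from difflib import get_close_matches

# Hypothetical labels; "camerabatery" is a deliberate misspelling.
labels = ["camerabattery", "camerabatery", "lens", "flash", "gamepad"]

def near_duplicates(values, cutoff=0.9):
    """Return pairs of labels whose similarity ratio is at least `cutoff`."""
    pairs = []
    for i, value in enumerate(values):
        # Compare each label only with the labels that follow it.
        others = values[i + 1:]
        for match in get_close_matches(value, others, n=3, cutoff=cutoff):
            pairs.append((value, match))
    return pairs

suspects = near_duplicates(labels)
print(suspects)
```

A high cutoff is used here so that legitimately related labels with a shared prefix are less likely to be flagged; the right threshold would need tuning on the real label set.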

Drop duplicate rows
In [42]:
main_df.shape
Out[42]:
(1568054, 20)
In [43]:
# Summing the boolean mask counts the duplicates and gives 0 when there are none
print(main_df.duplicated().sum())

print(round(100*(main_df.duplicated().sum()/main_df.shape[0]),4))
99283
6.3316
99283 rows (6.33%) are exact duplicates of an earlier row. We will go ahead and drop them.
In [44]:
main_df.drop_duplicates(keep='first', inplace=True)
main_df.shape
Out[44]:
(1468771, 20)
In [45]:
main_df.duplicated().value_counts()
Out[45]:
False    1468771
dtype: int64

No more duplicates
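The duplicate handling can be sketched on a toy frame with hypothetical values. duplicated() marks every repeat of an earlier row, and drop_duplicates(keep='first') retains the first occurrence.

```python
import pandas as pd

toy = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 3],
    "gmv": [100.0, 100.0, 250.0, 80.0, 80.0, 90.0],
})

# Rows are duplicates only if every column matches an earlier row.
n_dupes = int(toy.duplicated().sum())
deduped = toy.drop_duplicates(keep="first")

print(n_dupes, len(deduped))
```

The last row shares an order_id with the two before it but has a different gmv, so it is kept.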

Treating Nulls

In [46]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(main_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(main_df.isnull().sum()/main_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[46]:
                         Total  Percentage
gmv                       3705        0.25
product_procurement_sla      0        0.00
product_mrp                  0        0.00
order_date                   0        0.00
Year                         0        0.00

There are a few null values in the gmv column.

Also, some columns contain cells that hold only a single whitespace character.

Let us first convert these whitespace cells to NaNs and then treat them accordingly.

In [47]:
main_df.replace(' ', np.nan, inplace = True)
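Note that replace(' ', np.nan) matches only cells equal to exactly one space. If empty strings or longer runs of whitespace may be present, a regex match is one possible alternative. The sketch below compares both on made-up values; whether the real data needs the broader match is not known from the notebook.

```python
import numpy as np
import pandas as pd

# Toy cells (hypothetical): single space, double space, empty string, tab.
toy = pd.DataFrame({"cust_id": ["123", " ", "  ", ""],
                    "pincode": ["560001", " ", "110001", "\t"]})

# Exact match: only cells equal to a single space become NaN.
exact = toy.replace(" ", np.nan)

# Regex match: empty strings and any run of whitespace become NaN.
blank = toy.replace(r"^\s*$", np.nan, regex=True)

exact_nulls = int(exact.isna().sum().sum())
blank_nulls = int(blank.isna().sum().sum())

print(exact_nulls, blank_nulls)
```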

Again trying to find null percentage in columns

In [48]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(main_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(main_df.isnull().sum()/main_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])
pd.concat([total, percentage], axis = 1).head()
Out[48]:
                         Total  Percentage
pincode                   3705        0.25
cust_id                   3705        0.25
gmv                       3705        0.25
product_procurement_sla      0        0.00
deliverybdays                0        0.00

Removing rows where gmv is missing

In [49]:
# gmv is the key revenue column and cannot be dropped, so the rows with a missing gmv are removed instead
main_df = main_df[~pd.isnull(main_df['gmv'])]
org_shape = main_df.shape
org_shape
Out[49]:
(1465066, 20)

Again trying to find null percentage in columns

In [50]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(main_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(main_df.isnull().sum()/main_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])
pd.concat([total, percentage], axis = 1).head()
Out[50]:
Total Percentage
product_procurement_sla 0 0.0
product_mrp 0 0.0
order_date 0 0.0
Year 0 0.0
Month 0 0.0

No null values remain in the dataframe.

Selecting One Year of Data

As per business requirement, we have to use the data from July 2015 to June 2016.

Checking that the datatype of the column 'order_date' is datetime64

In [51]:
main_df.dtypes
Out[51]:
fsn_id                                     object
order_date                         datetime64[ns]
Year                                       object
Month                                      object
order_id                                  float64
order_item_id                             float64
gmv                                       float64
units                                       int64
deliverybdays                             float64
deliverycdays                             float64
s1_fact.order_payment_type                 object
sla                                         int64
cust_id                                    object
pincode                                    object
product_analytic_super_category            object
product_analytic_category                  object
product_analytic_sub_category              object
product_analytic_vertical                  object
product_mrp                               float64
product_procurement_sla                     int64
dtype: object

Identifying the data outside our analysis period - 01-Jul-2015 to 30-Jun-2016

In [52]:
main_df.loc[(main_df['order_date'].dt.floor("d") < '2015-07-01') | (main_df['order_date'].dt.floor("d") >= '2016-07-01')].shape
Out[52]:
(592, 20)

There are 592 records that fall outside our analysis period - 01-Jul-2015 to 30-Jun-2016

We will be dropping these records.

In [53]:
main_df = main_df.loc[(main_df['order_date'].dt.floor("d") >= '2015-07-01') & (main_df['order_date'].dt.floor("d") < '2016-07-01')]
main_df.shape
Out[53]:
(1464474, 20)
In [54]:
Max = pd.DataFrame(main_df[['order_date']].max().rename('Max'))
Min = pd.DataFrame(main_df[['order_date']].min().rename('Min'))

pd.concat([Min, Max], axis=1)
Out[54]:
Min Max
order_date 2015-07-01 00:36:11 2016-06-30 23:59:26

This verifies that our dataframe has data between July 2015 and June 2016 only.

Generate Week Column

In [55]:
# Inserting a new column at a specific position in a DataFrame (%V gives the ISO 8601 week number)
loc_index = main_df.columns.get_loc('Month') + 1
main_df.insert(loc=loc_index,column='Week',value=main_df['order_date'].dt.strftime("%V"))

main_df['Year'] = main_df['Year'].astype('str')
main_df.head()
Out[55]:
fsn_id order_date Year Month Week order_id order_item_id gmv units deliverybdays deliverycdays s1_fact.order_payment_type sla cust_id pincode product_analytic_super_category product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla
0 acccx3s58g7b5f6p 2015-10-17 15:11:54 2015 10 42 3.419301e+15 3.419301e+15 6400.0 1 0.0 0.0 cod 5 -1.01299130778588E+018 -7.79175582905735E+018 ce cameraaccessory cameraaccessory cameratripod 7190.0 0
1 acccx3s58g7b5f6p 2015-10-19 10:07:22 2015 10 43 1.420831e+15 1.420831e+15 6900.0 1 0.0 0.0 cod 7 -8.99032457905512E+018 7.33541149097431E+018 ce cameraaccessory cameraaccessory cameratripod 7190.0 0
2 acccx3s5ahmf55fv 2015-10-20 15:45:56 2015 10 43 2.421913e+15 2.421913e+15 1990.0 1 0.0 0.0 cod 10 -1.0404429420466E+018 -7.47768776228657E+018 ce cameraaccessory cameraaccessory cameratripod 2099.0 3
3 acccx3s5ahmf55fv 2015-10-14 12:05:15 2015 10 42 4.416592e+15 4.416592e+15 1690.0 1 0.0 0.0 prepaid 4 -7.60496084352714E+018 -5.83593163877661E+018 ce cameraaccessory cameraaccessory cameratripod 2099.0 3
4 acccx3s5ahmf55fv 2015-10-17 21:25:03 2015 10 42 4.419525e+15 4.419525e+15 1618.0 1 0.0 0.0 prepaid 6 2.8945572083453E+018 5.34735360997242E+017 ce cameraaccessory cameraaccessory cameratripod 2099.0 3
In [56]:
# Checking the combinations for any discrepancies

main_df.groupby(['Year', 'Week']).agg({'Month':"count"}).reset_index(drop=False)
Out[56]:
Year Week Month
0 2015 27 284
1 2015 28 22168
2 2015 29 22480
3 2015 30 23461
4 2015 31 15256
5 2015 32 26
6 2015 33 11
7 2015 34 8
8 2015 35 12
9 2015 36 21699
10 2015 37 22862
11 2015 38 20627
12 2015 39 22974
13 2015 40 22280
14 2015 41 19620
15 2015 42 106456
16 2015 43 22940
17 2015 44 30343
18 2015 45 34121
19 2015 46 29492
20 2015 47 20920
21 2015 48 22926
22 2015 49 25409
23 2015 50 34954
24 2015 51 26785
25 2015 52 44972
26 2015 53 16742
27 2016 01 32388
28 2016 02 26514
29 2016 03 39103
30 2016 04 28014
31 2016 05 32148
32 2016 06 29921
33 2016 07 38557
34 2016 08 35146
35 2016 09 33053
36 2016 10 45205
37 2016 11 31427
38 2016 12 29084
39 2016 13 29738
40 2016 14 30099
41 2016 15 24874
42 2016 16 14761
43 2016 17 52832
44 2016 18 31270
45 2016 19 33503
46 2016 20 32760
47 2016 21 43469
48 2016 22 31134
49 2016 23 30529
50 2016 24 28267
51 2016 25 24315
52 2016 26 13362
53 2016 53 13173
In [57]:
# ISO week 53 of 2015 runs into 1-3 January 2016; relabelling those rows as Year 2015 for consistency with the other data sets

# Updating the month as 12 for the above rows for consistency

# Dropping the few rows with week# 27 (1-5 July 2015): that ISO week starts on 29 June 2015, so it is counted as a June week

main_df.loc[(main_df.Year == '2016') & (main_df.Week == '53'), 'Year'] = '2015'

main_df.loc[(main_df.Year == '2015') & (main_df.Week == '53'), 'Month'] = 12

main_df.drop(main_df[main_df['Week'] == '27'].index, inplace = True)
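
The two edge cases handled above follow from how ISO 8601 numbers weeks: a week belongs to the year that contains its Thursday, so the first days of January can carry the previous year's last week number. A quick check with the standard library (the two dates are picked for illustration):

```python
from datetime import date

# 1 January 2016 falls in the ISO week that started on Monday, 28 December 2015
iso_year, iso_week, iso_weekday = date(2016, 1, 1).isocalendar()
print(iso_year, iso_week)

# 1 July 2015 falls in the ISO week that started on Monday, 29 June 2015
july_year, july_week, _ = date(2015, 7, 1).isocalendar()
print(july_year, july_week)
```

The ISO year returned by isocalendar() is what makes the "2016, week 53" rows consistent once they are relabelled as 2015.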
In [58]:
main_df.groupby(['Year', 'Week']).agg({'Month':"count"}).reset_index(drop=False)
Out[58]:
Year Week Month
0 2015 28 22168
1 2015 29 22480
2 2015 30 23461
3 2015 31 15256
4 2015 32 26
5 2015 33 11
6 2015 34 8
7 2015 35 12
8 2015 36 21699
9 2015 37 22862
10 2015 38 20627
11 2015 39 22974
12 2015 40 22280
13 2015 41 19620
14 2015 42 106456
15 2015 43 22940
16 2015 44 30343
17 2015 45 34121
18 2015 46 29492
19 2015 47 20920
20 2015 48 22926
21 2015 49 25409
22 2015 50 34954
23 2015 51 26785
24 2015 52 44972
25 2015 53 29915
26 2016 01 32388
27 2016 02 26514
28 2016 03 39103
29 2016 04 28014
30 2016 05 32148
31 2016 06 29921
32 2016 07 38557
33 2016 08 35146
34 2016 09 33053
35 2016 10 45205
36 2016 11 31427
37 2016 12 29084
38 2016 13 29738
39 2016 14 30099
40 2016 15 24874
41 2016 16 14761
42 2016 17 52832
43 2016 18 31270
44 2016 19 33503
45 2016 20 32760
46 2016 21 43469
47 2016 22 31134
48 2016 23 30529
49 2016 24 28267
50 2016 25 24315
51 2016 26 13362

Drop Insignificant Columns

In [59]:
main_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1464190 entries, 0 to 1473147
Data columns (total 21 columns):
fsn_id                             1464190 non-null object
order_date                         1464190 non-null datetime64[ns]
Year                               1464190 non-null object
Month                              1464190 non-null object
Week                               1464190 non-null object
order_id                           1464190 non-null float64
order_item_id                      1464190 non-null float64
gmv                                1464190 non-null float64
units                              1464190 non-null int64
deliverybdays                      1464190 non-null float64
deliverycdays                      1464190 non-null float64
s1_fact.order_payment_type         1464190 non-null object
sla                                1464190 non-null int64
cust_id                            1464190 non-null object
pincode                            1464190 non-null object
product_analytic_super_category    1464190 non-null object
product_analytic_category          1464190 non-null object
product_analytic_sub_category      1464190 non-null object
product_analytic_vertical          1464190 non-null object
product_mrp                        1464190 non-null float64
product_procurement_sla            1464190 non-null int64
dtypes: datetime64[ns](1), float64(6), int64(3), object(11)
memory usage: 245.8+ MB
In [60]:
# Dropping Columns with Single Value or all Different Values

count_df = pd.DataFrame(main_df.apply(lambda x: len(x.value_counts()), axis=0), columns=['Count'])

drop_columns = list(count_df.loc[(count_df['Count']==1) | (count_df['Count']==len(main_df.index))].index)

print('Dropping these columns => {}'.format(drop_columns))

main_df.drop(drop_columns, axis=1, inplace=True)
Dropping these columns => ['product_analytic_super_category']
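
The same selection can be written with DataFrame.nunique, which counts distinct non-null values per column. This is a sketch on a toy dataframe; the column names below are hypothetical stand-ins, not the notebook's columns.

```python
import pandas as pd

toy_df = pd.DataFrame({
    'super_category': ['ce', 'ce', 'ce', 'ce'],        # single value
    'row_key': ['a', 'b', 'c', 'd'],                   # all different
    'payment_type': ['cod', 'prepaid', 'cod', 'cod'],  # informative
})

# Columns with one distinct value, or as many distinct values as rows, carry no signal
n_unique = toy_df.nunique()
drop_columns = n_unique[(n_unique == 1) | (n_unique == len(toy_df))].index.tolist()
kept = toy_df.drop(columns=drop_columns)
print(drop_columns)
```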
In [61]:
# Dropping Columns which are insignificant to the analysis

drop_columns = ['fsn_id', 'order_id', 'order_item_id', 'cust_id']

main_df.drop(drop_columns, axis=1, inplace=True)

Feature Engineering

In [62]:
main_df.head()
Out[62]:
order_date Year Month Week gmv units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla
0 2015-10-17 15:11:54 2015 10 42 6400.0 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0
1 2015-10-19 10:07:22 2015 10 43 6900.0 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0
2 2015-10-20 15:45:56 2015 10 43 1990.0 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3
3 2015-10-14 12:05:15 2015 10 42 1690.0 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3
4 2015-10-17 21:25:03 2015 10 42 1618.0 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3

Creating column List Price

List Price = GMV/Units

In [63]:
# Inserting a new column at a specific position in a DataFrame
loc_index = main_df.columns.get_loc('gmv') + 1
main_df.insert(loc=loc_index,column='list_price',value = main_df['gmv'] / main_df['units'])
main_df.head()
Out[63]:
order_date Year Month Week gmv list_price units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3

Displaying the column values for a few orders when units > 1

In [64]:
main_df.loc[main_df['units'] != 1][['gmv','list_price','product_mrp','units']].head()
Out[64]:
gmv list_price product_mrp units
559 13560.0 6780.0 8950.0 2
667 940.0 470.0 1545.0 2
669 940.0 470.0 1545.0 2
671 940.0 470.0 1545.0 2
674 940.0 470.0 1545.0 2

Creating a Payday Flag (+- 1 for salary days)

If the order date is on or around a salary day in Ontario (the 1st and 15th of every month), we flag the row as 1, else as 0

In [65]:
# Use the integer day of month; strftime('%d') is zero-padded ('01', '02'), so string tests against '1' and '2' would never match
main_df['payday_flag'] = main_df['order_date'].dt.day.isin([14, 15, 16, 30, 31, 1, 2]).astype(int)
main_df.head()
Out[65]:
order_date Year Month Week gmv list_price units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0
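
When building a day-of-month flag like this one, it helps to remember that strftime('%d') zero-pads the day, so the 1st is '01' rather than '1'. The sketch below uses a handful of made-up timestamps and compares integer days instead, which sidesteps the padding question:

```python
import pandas as pd

order_dates = pd.Series(pd.to_datetime([
    '2015-10-01 09:00:00', '2015-10-14 12:05:15',
    '2015-10-17 15:11:54', '2015-10-31 23:10:00',
]))

# strftime('%d') returns a two-character, zero-padded string
padded_day = order_dates.iloc[0].strftime('%d')

# Integer day-of-month avoids string comparisons altogether
payday_days = [14, 15, 16, 30, 31, 1, 2]
payday_flag = order_dates.dt.day.isin(payday_days).astype(int)
print(padded_day, payday_flag.tolist())
```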

Creating an Occasion Flag

If it is a holiday/occasion in Ontario, we flag the column as 1, else as 0

The following table lists all the holidays/occasions from 1st July, 2015 to 30th June, 2016**

Occasion Day
Canada Day July 1, 2015
Civic Holiday August 3, 2015
Labour Day September 7, 2015
Thanksgiving October 12, 2015
Halloween October 31, 2015
Remembrance Day November 11, 2015
Christmas Day December 25, 2015
Boxing Day December 26, 2015
New Year's Day January 1, 2016
Islander Day February 15, 2016
Louis Riel Day February 15, 2016
Heritage Day February 15, 2016
Family Day February 15, 2016
Valentine's Day February 14, 2016
Leap Day February 29, 2016
St. Patrick's Day March 17, 2016
Good Friday March 25, 2016
Easter Monday March 28, 2016
Mother's Day May 8, 2016
Victoria Day May 23, 2016
Father's Day June 19, 2016
Aboriginal Day June 21, 2016
St. Jean Baptiste Day June 24, 2016

**reference link: https://www.statutoryholidays.com/

In [66]:
# Holidays/occasions in the analysis period as (year, month, day) tuples
HOLIDAYS = {
    (2015, 7, 1), (2015, 8, 3), (2015, 9, 7), (2015, 10, 12), (2015, 10, 31),
    (2015, 11, 11), (2015, 12, 25), (2015, 12, 26),
    (2016, 1, 1), (2016, 2, 14), (2016, 2, 15), (2016, 2, 29), (2016, 3, 17),
    (2016, 3, 25), (2016, 3, 28), (2016, 5, 8), (2016, 5, 23), (2016, 6, 19),
    (2016, 6, 21), (2016, 6, 24),
}

def holidayflg(ord_date):
    # Compare integer date parts: strftime('%Y') returns a string, so testing it
    # against the integer 2015 is always False and no 2015 holiday would be flagged
    return 1 if (ord_date.year, ord_date.month, ord_date.day) in HOLIDAYS else 0
In [67]:
main_df['occassion_flag'] = main_df['order_date'].apply(lambda x:holidayflg(x))
main_df.head()
Out[67]:
order_date Year Month Week gmv list_price units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0

Creating a column called Product Type - Luxury / Mass_market

If the GMV value is at or above the 80th percentile, then luxury, else mass_market

In [68]:
pd.DataFrame(main_df['gmv']).describe(percentiles=[.70,.80,.90]).T
Out[68]:
count mean std min 50% 70% 80% 90% max
gmv 1464190.0 2483.958105 5622.334895 0.0 790.0 1599.0 2450.0 4300.0 226947.0
In [69]:
main_df['gmv'].quantile(.8)
Out[69]:
2450.0
In [70]:
main_df['product_type'] = main_df['gmv'].apply(lambda x:'luxury' if x >= 2450 else 'mass_market')
main_df.head()
Out[70]:
order_date Year Month Week gmv list_price units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
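
The threshold of 2450 above is typed in by hand from the quantile output. A variant that derives it from the data, so the rule still holds if the rows change, could look like the sketch below. The toy GMV values are made up; only the column names follow the notebook.

```python
import numpy as np
import pandas as pd

toy_df = pd.DataFrame({'gmv': [100.0, 200.0, 300.0, 400.0, 500.0, 600.0,
                               700.0, 800.0, 900.0, 1000.0, 5000.0]})

# 80th percentile computed from the data instead of a hard-coded constant
threshold = toy_df['gmv'].quantile(0.8)

# Same rule as above: at or above the threshold is luxury, otherwise mass_market
toy_df['product_type'] = np.where(toy_df['gmv'] >= threshold, 'luxury', 'mass_market')
print(threshold, toy_df['product_type'].value_counts().to_dict())
```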

Calculating Discount %

Discount = (product_mrp - list_price) / product_mrp

In [71]:
# Inserting a new column at a specific position in a DataFrame
col_loc = main_df.columns.get_loc('list_price') + 1
main_df.insert(loc=col_loc, column='Discount%', value = \
               round(100*((main_df['product_mrp'] - main_df['list_price']) / main_df['product_mrp']),2))
main_df.head()
Out[71]:
order_date Year Month Week gmv list_price Discount% units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 10.99 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 4.03 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 5.19 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 19.49 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 22.92 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
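
As a worked check of the two formulas (List Price = GMV/Units and Discount = (product_mrp - list_price) / product_mrp), the sketch below recomputes three rows whose gmv, units and product_mrp values are taken from the outputs shown earlier in this section:

```python
import pandas as pd

# gmv, units and product_mrp copied from rows 0, 2 and 559 shown above
toy_df = pd.DataFrame({'gmv': [6400.0, 1990.0, 13560.0],
                       'units': [1, 1, 2],
                       'product_mrp': [7190.0, 2099.0, 8950.0]})

# Per-unit selling price
toy_df['list_price'] = toy_df['gmv'] / toy_df['units']

# Discount as a percentage of MRP, rounded to two decimals
toy_df['Discount%'] = (100 * (toy_df['product_mrp'] - toy_df['list_price'])
                       / toy_df['product_mrp']).round(2)
print(toy_df)
```

For the first row this gives (7190 - 6400) / 7190, which is about 10.99%, in line with the Discount% column above.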

Statistical Info

Numeric columns
In [72]:
main_df.describe().T
Out[72]:
count mean std min 25% 50% 75% max
gmv 1464190.0 2483.958105 5622.334895 0.0 349.00 790.00 1999.00 226947.0
list_price 1464190.0 2450.951701 5541.148673 0.0 349.00 775.00 1999.00 226947.0
Discount% 1464190.0 44.547432 22.777238 0.0 27.94 45.05 61.18 100.0
units 1464190.0 1.021544 0.255668 1.0 1.00 1.00 1.00 50.0
deliverybdays 1464190.0 1.032111 2.477507 0.0 0.00 0.00 0.00 238.0
deliverycdays 1464190.0 1.202033 2.884773 0.0 0.00 0.00 0.00 278.0
sla 1464190.0 5.760836 2.993450 0.0 4.00 6.00 7.00 1006.0
product_mrp 1464190.0 4223.001376 8653.348732 49.0 849.00 1699.00 3499.00 299999.0
product_procurement_sla 1464190.0 2.701303 1.786134 0.0 2.00 2.00 3.00 15.0
payday_flag 1464190.0 0.170539 0.376105 0.0 0.00 0.00 0.00 1.0
occassion_flag 1464190.0 0.033711 0.180484 0.0 0.00 0.00 0.00 1.0
Distribution of the numeric columns
In [73]:
for col in main_df.describe().columns:
    print('#############')
    print(col)
    print('#############')
    main_df[col].hist()
    plt.show()
[Histograms, one per column: gmv, list_price, Discount%, units, deliverybdays, deliverycdays, sla, product_mrp, product_procurement_sla, payday_flag, occassion_flag]
Categorical columns
In [74]:
cat_var = [cname for cname in main_df.columns if 
                                main_df[cname].dtype == "object"]

main_df[cat_var].describe().T
Out[74]:
count unique top freq
Year 1464190 2 2016 821473
Month 1464190 13 10 187636
Week 1464190 52 42 106456
s1_fact.order_payment_type 1464190 2 cod 1053934
pincode 1.46419e+06 12877 -9.03104e+18 7512
product_analytic_category 1464190 5 entertainmentsmall 834380
product_analytic_sub_category 1464190 14 speaker 468310
product_analytic_vertical 1464190 73 mobilespeaker 235677
product_type 1464190 2 mass_market 1171351

Outlier Treatment

In [75]:
main_df.head()
Out[75]:
order_date Year Month Week gmv list_price Discount% units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 10.99 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 4.03 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 5.19 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 19.49 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 22.92 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
In [76]:
main_df.dtypes
Out[76]:
order_date                       datetime64[ns]
Year                                     object
Month                                    object
Week                                     object
gmv                                     float64
list_price                              float64
Discount%                               float64
units                                     int64
deliverybdays                           float64
deliverycdays                           float64
s1_fact.order_payment_type               object
sla                                       int64
pincode                                  object
product_analytic_category                object
product_analytic_sub_category            object
product_analytic_vertical                object
product_mrp                             float64
product_procurement_sla                   int64
payday_flag                               int64
occassion_flag                            int64
product_type                             object
dtype: object
In [77]:
numeric_variables=['gmv','list_price','Discount%','deliverybdays','deliverycdays','sla','product_mrp','product_procurement_sla']

# Function to draw a box plot for each numeric variable in the list
def univariate_continuos(var_list):
    plt.figure(figsize=(12,6))
    for i, var in enumerate(var_list, start=1):
        # One panel per variable on a 2 x 4 grid
        plt.subplot(2,4,i)
        sns.boxplot(y=var,palette='cubehelix', data=main_df)
    # Automatically adjust subplot params so that the subplots fit into the figure area.
    plt.tight_layout()
    # display the plot
    plt.show()
In [78]:
univariate_continuos(numeric_variables)

There seem to be a lot of outliers in these columns. However, not all of them are random errors, so we have to be careful about which ones to treat.

In [79]:
# Checking outliers at 25%,50%,75%,90%,95% and 99%

main_df[numeric_variables].describe(percentiles=[.25,.5,.75,.90,.95,.99]).T
Out[79]:
count mean std min 25% 50% 75% 90% 95% 99% max
gmv 1464190.0 2483.958105 5622.334895 0.0 349.00 790.00 1999.00 4300.00 11170.00 30499.0 226947.0
list_price 1464190.0 2450.951701 5541.148673 0.0 349.00 775.00 1999.00 4299.00 10990.00 29746.0 226947.0
Discount% 1464190.0 44.547432 22.777238 0.0 27.94 45.05 61.18 74.93 81.32 90.4 100.0
deliverybdays 1464190.0 1.032111 2.477507 0.0 0.00 0.00 0.00 5.00 6.00 8.0 238.0
deliverycdays 1464190.0 1.202033 2.884773 0.0 0.00 0.00 0.00 5.00 7.00 10.0 278.0
sla 1464190.0 5.760836 2.993450 0.0 4.00 6.00 7.00 9.00 10.00 13.0 1006.0
product_mrp 1464190.0 4223.001376 8653.348732 49.0 849.00 1699.00 3499.00 7150.00 17995.00 45990.0 299999.0
product_procurement_sla 1464190.0 2.701303 1.786134 0.0 2.00 2.00 3.00 5.00 5.00 7.0 15.0

After analyzing the distribution of values of the above attributes at different percentiles, we come to the following conclusions:

  • 'gmv', 'list_price' and 'product_mrp' can have occasional outliers because high-value sales do occur. These records are infrequent but genuine, not random errors.
  • 'Discount%' can also be high for certain products. Specifically, a discount of 100% would mean the product came free with some other product.
  • On the other hand, unusually high values of 'deliverybdays', 'deliverycdays' and 'sla' seem to be erroneous data. So we will treat these outlier values so that they do not distort the predictive model, while retaining enough data to build a generalizable model.
In [80]:
main_df[['deliverybdays', 'deliverycdays' ,'sla']].quantile([.95, .99])
Out[80]:
deliverybdays deliverycdays sla
0.95 6.0 7.0 10.0
0.99 8.0 10.0 13.0

Let us treat any observation above the 99th percentile of a variable as an outlier for that variable.

Percentage of Outliers in column deliverybdays

In [81]:
print(round(100*(main_df.loc[main_df['deliverybdays'] > 8.0].shape[0] / main_df.shape[0]),4))
0.8753

Percentage of Outliers in column deliverycdays

In [82]:
print(round(100*(main_df.loc[main_df['deliverycdays'] > 10.0].shape[0] / main_df.shape[0]),4))
0.6466

Percentage of Outliers in column sla

In [83]:
print(round(100*(main_df.loc[main_df['sla'] > 13.0].shape[0] / main_df.shape[0]),4))
0.8393

Capping value above or below a certain percentile:

For the variables 'sla', 'deliverybdays' and 'deliverycdays', where erroneous outliers are present, we will cap the values above the 99th percentile at the value corresponding to the 99th percentile.

In [84]:
# Capping the outlier values at the 99th-percentile values found above
main_df.loc[main_df['deliverybdays'] > 8.0, 'deliverybdays'] = 8.0
main_df.loc[main_df['deliverycdays'] > 10.0, 'deliverycdays'] = 10.0
main_df.loc[main_df['sla'] > 13.0, 'sla'] = 13.0
In [85]:
print(round(100*(main_df.shape[0] / initial_shape[0])))
89

So 89% of the records have been retained so far. Capping does not remove any rows; the reduction comes from the earlier cleaning steps.
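
Capping at a fixed upper bound can also be expressed with Series.clip, which replaces values above the bound and leaves the row count unchanged. A minimal sketch on toy data, using the 99th-percentile values reported above as the caps (the rows themselves are made up):

```python
import pandas as pd

toy_df = pd.DataFrame({'sla': [4, 5, 6, 7, 1006],
                       'deliverybdays': [0.0, 0.0, 5.0, 6.0, 238.0]})

# Upper bounds: 99th-percentile values from the quantile table above
caps = {'sla': 13, 'deliverybdays': 8.0}

for col, cap in caps.items():
    # Values above the cap are set to the cap; everything else is untouched
    toy_df[col] = toy_df[col].clip(upper=cap)

print(toy_df)
```

On the real data the caps could be taken from main_df[col].quantile(0.99) instead of being typed in.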

In [1]:
# Checking outliers at 25%,50%,75%,90%,95% and 99%

main_df.describe(percentiles=[.25,.5,.75,.90,.95,.99]).T
In [87]:
univariate_continuos(numeric_variables)

Skewness

In [88]:
# Checking the skewness in the dataset

main_df[numeric_variables].skew()
Out[88]:
gmv                         5.177873
list_price                  5.161353
Discount%                  -0.063059
deliverybdays               1.944739
deliverycdays              10.297208
sla                         0.356847
product_mrp                 4.612160
product_procurement_sla     2.490341
dtype: float64
In [89]:
num_col = ['gmv','deliverybdays','deliverycdays','list_price','product_mrp','product_procurement_sla']

# Function to plot a fitted normal curve for each numeric variable in the list
def distplots(var_list):
    plt.figure(figsize=(15,6))
    for i, var in enumerate(var_list, start=1):
        plt.subplot(2,3,i)
        # hist=False and kde=False leave only the fitted normal curve
        sns.distplot(main_df[var], fit=norm, kde=False, hist=False)
    # Automatically adjust subplot params so that the subplots fit into the figure area.
    plt.tight_layout()
    # display the plot
    plt.show()

distplots(num_col)
In [90]:
main_df.isnull().values.any()
Out[90]:
False

Mapping months to number of weeks for the month

The function below takes a year as input and returns, for each month, the ISO numbers of the weeks whose Monday falls in that month

In [91]:
import calendar

def WeekFinderFromYear(year):
    """Return the ISO week numbers of every Monday in `year`, grouped by month number and by month name."""
    # Imported locally so the module does not replace the `datetime` class imported earlier
    import datetime

    year = int(year)

    # Start from the first Monday of the year (weekday() == 0)
    dt = datetime.date(year, 1, 1)
    while dt.weekday() != 0:
        dt = dt + datetime.timedelta(days=1)

    month_number_week = {m: [] for m in range(1, 13)}
    month_name_week = {calendar.month_abbr[m]: [] for m in range(1, 13)}

    # Step through the year one Monday at a time; each week is assigned to the month of its Monday
    while dt.year == year:
        week = dt.isocalendar()[1]
        month_number_week[dt.month].append(week)
        month_name_week[calendar.month_abbr[dt.month]].append(week)
        dt = dt + datetime.timedelta(days=7)

    return month_number_week, month_name_week
In [92]:
dict_number_2015, dict_name_2015 = WeekFinderFromYear(2015)
print(dict_number_2015)
print()
print(dict_name_2015)

print()

dict_number_2016, dict_name_2016 = WeekFinderFromYear(2016)
print(dict_number_2016)
print()
print(dict_name_2016)
{1: [2, 3, 4, 5], 2: [6, 7, 8, 9], 3: [10, 11, 12, 13, 14], 4: [15, 16, 17, 18], 5: [19, 20, 21, 22], 6: [23, 24, 25, 26, 27], 7: [28, 29, 30, 31], 8: [32, 33, 34, 35, 36], 9: [37, 38, 39, 40], 10: [41, 42, 43, 44], 11: [45, 46, 47, 48, 49], 12: [50, 51, 52, 53]}

{'Jan': [2, 3, 4, 5], 'Feb': [6, 7, 8, 9], 'Mar': [10, 11, 12, 13, 14], 'Apr': [15, 16, 17, 18], 'May': [19, 20, 21, 22], 'Jun': [23, 24, 25, 26, 27], 'Jul': [28, 29, 30, 31], 'Aug': [32, 33, 34, 35, 36], 'Sep': [37, 38, 39, 40], 'Oct': [41, 42, 43, 44], 'Nov': [45, 46, 47, 48, 49], 'Dec': [50, 51, 52, 53]}

{1: [1, 2, 3, 4], 2: [5, 6, 7, 8, 9], 3: [10, 11, 12, 13], 4: [14, 15, 16, 17], 5: [18, 19, 20, 21, 22], 6: [23, 24, 25, 26], 7: [27, 28, 29, 30], 8: [31, 32, 33, 34, 35], 9: [36, 37, 38, 39], 10: [40, 41, 42, 43, 44], 11: [45, 46, 47, 48], 12: [49, 50, 51, 52]}

{'Jan': [1, 2, 3, 4], 'Feb': [5, 6, 7, 8, 9], 'Mar': [10, 11, 12, 13], 'Apr': [14, 15, 16, 17], 'May': [18, 19, 20, 21, 22], 'Jun': [23, 24, 25, 26], 'Jul': [27, 28, 29, 30], 'Aug': [31, 32, 33, 34, 35], 'Sep': [36, 37, 38, 39], 'Oct': [40, 41, 42, 43, 44], 'Nov': [45, 46, 47, 48], 'Dec': [49, 50, 51, 52]}
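
The printed dictionaries show that each week is assigned to the month containing its Monday, and that the week numbers follow the ISO calendar (which is why 2015 has a week 53). As a cross-check, the same mapping can be sketched more compactly. The helper name `monday_weeks_by_month` is hypothetical and is not used elsewhere in this notebook.

```python
import datetime

def monday_weeks_by_month(year):
    """Group the ISO week numbers of every Monday in `year` by calendar month."""
    weeks = {month: [] for month in range(1, 13)}

    # Jump from 1 January to the first Monday of the year
    day = datetime.date(year, 1, 1)
    day += datetime.timedelta(days=(7 - day.weekday()) % 7)

    # Step one week at a time until the year ends
    while day.year == year:
        weeks[day.month].append(day.isocalendar()[1])
        day += datetime.timedelta(days=7)

    return weeks

weeks_2015 = monday_weeks_by_month(2015)
weeks_2016 = monday_weeks_by_month(2016)
print(weeks_2015[7], weeks_2016[1])
```

Under this rule a week that straddles two months counts entirely towards the month in which it starts.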
In [93]:
num_weeks_2015 = 0
num_weeks_2016 = 0

for i in dict_number_2015:
    if i >= 7:
        num_weeks_2015 += len(dict_number_2015[i])

for i in dict_number_2016:
    if i <= 6:
        num_weeks_2016 += len(dict_number_2016[i])

total_weeks = num_weeks_2015 + num_weeks_2016
total_weeks
Out[93]:
52

As expected, our dataset covers 52 weeks in total (July 2015 to June 2016)

Media Investment

We will convert the monthly media investment data (keyed by Year and Month) into weekly data

In [94]:
media_investment.head()
Out[94]:
Year Month Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 2015 7 17.061775 0.215330 2.533014 7.414270 0.000933 1.327278 0.547254 5.023697 NaN NaN
1 2015 8 5.064306 0.006438 1.278074 1.063332 0.000006 0.129244 0.073684 2.513528 NaN NaN
2 2015 9 96.254380 3.879504 1.356528 62.787651 0.610292 16.379990 5.038266 6.202149 NaN NaN
3 2015 10 170.156297 6.144711 12.622480 84.672532 3.444075 24.371778 6.973711 31.927011 NaN NaN
4 2015 11 51.216220 4.220630 1.275469 14.172116 0.168633 19.561574 6.595767 5.222032 NaN NaN
In [95]:
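# Imputing nulls with 0 (assigning the result back avoids in-place modification of a column selection)

media_investment['Radio'] = media_investment['Radio'].fillna(value=0)
media_investment['Other'] = media_investment['Other'].fillna(value=0)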

media_investment.head()
Out[95]:
Year Month Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 2015 7 17.061775 0.215330 2.533014 7.414270 0.000933 1.327278 0.547254 5.023697 0.0 0.0
1 2015 8 5.064306 0.006438 1.278074 1.063332 0.000006 0.129244 0.073684 2.513528 0.0 0.0
2 2015 9 96.254380 3.879504 1.356528 62.787651 0.610292 16.379990 5.038266 6.202149 0.0 0.0
3 2015 10 170.156297 6.144711 12.622480 84.672532 3.444075 24.371778 6.973711 31.927011 0.0 0.0
4 2015 11 51.216220 4.220630 1.275469 14.172116 0.168633 19.561574 6.595767 5.222032 0.0 0.0
In [96]:
# Temp DataFrame

temp_media_investment = pd.DataFrame(index=range(total_weeks), columns=['Year', 'Month', 'Week', 'Total Investment', 'TV', 'Digital', 'Sponsorship', 'Content Marketing', 'Online marketing', 'Affiliates', 'SEM', 'Radio', 'Other'])

temp_media_investment.head()
Out[96]:
Year Month Week Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

We divide each monthly value by the number of weeks in that month and use the result as the value for every week of the month

In [97]:
i = 0

# Week lookups built above, keyed by year
weeks_by_year = {2015: dict_number_2015, 2016: dict_number_2016}

# Weekly column -> monthly source column
# (the source data names the affiliates column ' Affiliates', with a leading space)
channel_cols = {col: col for col in ['Total Investment', 'TV', 'Digital', 'Sponsorship', 'Content Marketing', 'Online marketing', 'SEM', 'Radio', 'Other']}
channel_cols['Affiliates'] = ' Affiliates'

for index, row in media_investment.iterrows():

    weeks = weeks_by_year[row.Year][row.Month]
    num_weeks = len(weeks)

    for week in weeks:

        # .loc[i, col] writes to the frame itself; chained .iloc[i][col] may write to a temporary copy
        temp_media_investment.loc[i, 'Year'] = row.Year
        temp_media_investment.loc[i, 'Month'] = row.Month
        temp_media_investment.loc[i, 'Week'] = week

        # Spread the monthly spend evenly over the weeks of the month
        for weekly_col, monthly_col in channel_cols.items():
            temp_media_investment.loc[i, weekly_col] = round(row[monthly_col] / num_weeks, 3)

        i += 1

temp_media_investment.head()
Out[97]:
Year Month Week Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 2015 7 28 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
1 2015 7 29 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
2 2015 7 30 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
3 2015 7 31 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
4 2015 8 32 1.013 0.001 0.256 0.213 0 0.026 0.015 0.503 0 0
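
The even split of a monthly figure across its weeks can be illustrated on a small example. This is only a sketch: the `monthly` frame and the `weeks_in_month` lookup below are made-up toy inputs, not the notebook's data.

```python
import pandas as pd

# Hypothetical toy inputs: two months of TV spend and the weeks assigned to each month
monthly = pd.DataFrame({'Year': [2015, 2015], 'Month': [7, 8], 'TV': [0.4, 0.5]})
weeks_in_month = {(2015, 7): [28, 29, 30, 31], (2015, 8): [32, 33, 34, 35, 36]}

rows = []
for record in monthly.to_dict('records'):
    weeks = weeks_in_month[(record['Year'], record['Month'])]
    for week in weeks:
        # Every week of the month receives an equal share of the monthly spend
        rows.append({'Week': week, 'TV': round(record['TV'] / len(weeks), 3)})

weekly = pd.DataFrame(rows)
print(weekly)
```

Because the weekly values are rounded to three decimals, very small monthly amounts round to zero. This is why Content Marketing shows 0 for the July and August weeks above.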
In [98]:
temp_media_investment['Week'] = temp_media_investment['Week'].astype('str')
In [99]:
media_investment = temp_media_investment

media_investment.head()
Out[99]:
Year Month Week Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 2015 7 28 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
1 2015 7 29 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
2 2015 7 30 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
3 2015 7 31 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
4 2015 8 32 1.013 0.001 0.256 0.213 0 0.026 0.015 0.503 0 0
In [100]:
# Checking for duplicates
media_investment.duplicated('Week').value_counts()
Out[100]:
False    52
dtype: int64

No duplicates

In [101]:
# Checking for nulls
media_investment.isnull().values.any()
Out[101]:
False

No Nulls

In [102]:
# Dropping Year and Month columns as we won't be needing them anymore

del media_investment['Year']
del media_investment['Month']
media_investment.head()
Out[102]:
Week Total Investment TV Digital Sponsorship Content Marketing Online marketing Affiliates SEM Radio Other
0 28 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
1 29 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
2 30 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
3 31 4.265 0.054 0.633 1.854 0 0.332 0.137 1.256 0 0
4 32 1.013 0.001 0.256 0.213 0 0.026 0.015 0.503 0 0
In [103]:
original_col = media_investment.columns[1:]
original_col
Out[103]:
Index(['Total Investment', 'TV', 'Digital', 'Sponsorship', 'Content Marketing', 'Online marketing', 'Affiliates', 'SEM', 'Radio', 'Other'], dtype='object')

Calculating 8-week Exponential Moving Average for all Advertising media channels

In [104]:
def EMA_variables(df,var,n):
    for i in var:
        loc_index = df.columns.get_loc(i) + 1
        df.insert(loc=loc_index,column= i+'_EMA_'+str(n),value=df[i].ewm(span=n, adjust=False).mean())
    return df
In [105]:
media_investment = EMA_variables(media_investment,original_col,8) 
media_investment.head()
Out[105]:
Week Total Investment Total Investment_EMA_8 TV TV_EMA_8 Digital Digital_EMA_8 Sponsorship Sponsorship_EMA_8 Content Marketing Content Marketing_EMA_8 Online marketing Online marketing_EMA_8 Affiliates Affiliates_EMA_8 SEM SEM_EMA_8 Radio Radio_EMA_8 Other Other_EMA_8
0 28 4.265 4.265000 0.054 0.054000 0.633 0.633000 1.854 1.854000 0 0.0 0.332 0.332 0.137 0.137000 1.256 1.256000 0 0.0 0 0.0
1 29 4.265 4.265000 0.054 0.054000 0.633 0.633000 1.854 1.854000 0 0.0 0.332 0.332 0.137 0.137000 1.256 1.256000 0 0.0 0 0.0
2 30 4.265 4.265000 0.054 0.054000 0.633 0.633000 1.854 1.854000 0 0.0 0.332 0.332 0.137 0.137000 1.256 1.256000 0 0.0 0 0.0
3 31 4.265 4.265000 0.054 0.054000 0.633 0.633000 1.854 1.854000 0 0.0 0.332 0.332 0.137 0.137000 1.256 1.256000 0 0.0 0 0.0
4 32 1.013 3.542333 0.001 0.042222 0.256 0.549222 0.213 1.489333 0 0.0 0.026 0.264 0.015 0.109889 0.503 1.088667 0 0.0 0 0.0
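
With `adjust=False`, pandas computes the exponential moving average recursively as `y_t = (1 - alpha) * y_(t-1) + alpha * x_t`, where `alpha = 2 / (span + 1)`. The sketch below reproduces the week 32 value of `Total Investment_EMA_8` by hand. The variable names are illustrative, and the input is the first five weekly values from the table above.

```python
import pandas as pd

# First five weekly Total Investment values from the table above
spend = pd.Series([4.265, 4.265, 4.265, 4.265, 1.013])

span = 8
alpha = 2 / (span + 1)

# Manual recursion, seeded with the first observation
manual = [spend.iloc[0]]
for x in spend.iloc[1:]:
    manual.append(alpha * x + (1 - alpha) * manual[-1])

# The same calculation via pandas
ema = spend.ewm(span=span, adjust=False).mean()

print(round(manual[-1], 6), round(ema.iloc[-1], 6))
```

Both values should agree with the 3.542333 shown for week 32.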

Calculating 5-week Simple Moving Average for all Advertising media channels

In [106]:
def SMA_variables(df,var,n):
    for i in var:
        loc_index = df.columns.get_loc(i) + 1
        df.insert(loc=loc_index,column= i+'_SMA_'+str(n),value=df[i].rolling(window=n).mean())
    return df
In [107]:
media_investment = SMA_variables(media_investment,original_col,5) 
media_investment.head()
Out[107]:
Week Total Investment Total Investment_SMA_5 Total Investment_EMA_8 TV TV_SMA_5 TV_EMA_8 Digital Digital_SMA_5 Digital_EMA_8 Sponsorship Sponsorship_SMA_5 Sponsorship_EMA_8 Content Marketing Content Marketing_SMA_5 Content Marketing_EMA_8 Online marketing Online marketing_SMA_5 Online marketing_EMA_8 Affiliates Affiliates_SMA_5 Affiliates_EMA_8 SEM SEM_SMA_5 SEM_EMA_8 Radio Radio_SMA_5 Radio_EMA_8 Other Other_SMA_5 Other_EMA_8
0 28 4.265 NaN 4.265000 0.054 NaN 0.054000 0.633 NaN 0.633000 1.854 NaN 1.854000 0 NaN 0.0 0.332 NaN 0.332 0.137 NaN 0.137000 1.256 NaN 1.256000 0 NaN 0.0 0 NaN 0.0
1 29 4.265 NaN 4.265000 0.054 NaN 0.054000 0.633 NaN 0.633000 1.854 NaN 1.854000 0 NaN 0.0 0.332 NaN 0.332 0.137 NaN 0.137000 1.256 NaN 1.256000 0 NaN 0.0 0 NaN 0.0
2 30 4.265 NaN 4.265000 0.054 NaN 0.054000 0.633 NaN 0.633000 1.854 NaN 1.854000 0 NaN 0.0 0.332 NaN 0.332 0.137 NaN 0.137000 1.256 NaN 1.256000 0 NaN 0.0 0 NaN 0.0
3 31 4.265 NaN 4.265000 0.054 NaN 0.054000 0.633 NaN 0.633000 1.854 NaN 1.854000 0 NaN 0.0 0.332 NaN 0.332 0.137 NaN 0.137000 1.256 NaN 1.256000 0 NaN 0.0 0 NaN 0.0
4 32 1.013 3.6146 3.542333 0.001 0.0434 0.042222 0.256 0.5576 0.549222 0.213 1.5258 1.489333 0 0.0 0.0 0.026 0.2708 0.264 0.015 0.1126 0.109889 0.503 1.1054 1.088667 0 0.0 0.0 0 0.0 0.0

Calculating 3-week Simple Moving Average for all Advertising media channels

In [108]:
media_investment = SMA_variables(media_investment,original_col,3) 
media_investment.head()
Out[108]:
Week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 TV TV_SMA_3 TV_SMA_5 TV_EMA_8 Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Other Other_SMA_3 Other_SMA_5 Other_EMA_8
0 28 4.265 NaN NaN 4.265000 0.054 NaN NaN 0.054000 0.633 NaN NaN 0.633000 1.854 NaN NaN 1.854000 0 NaN NaN 0.0 0.332 NaN NaN 0.332 0.137 NaN NaN 0.137000 1.256 NaN NaN 1.256000 0 NaN NaN 0.0 0 NaN NaN 0.0
1 29 4.265 NaN NaN 4.265000 0.054 NaN NaN 0.054000 0.633 NaN NaN 0.633000 1.854 NaN NaN 1.854000 0 NaN NaN 0.0 0.332 NaN NaN 0.332 0.137 NaN NaN 0.137000 1.256 NaN NaN 1.256000 0 NaN NaN 0.0 0 NaN NaN 0.0
2 30 4.265 4.265 NaN 4.265000 0.054 0.054000 NaN 0.054000 0.633 0.633000 NaN 0.633000 1.854 1.854 NaN 1.854000 0 0.0 NaN 0.0 0.332 0.332 NaN 0.332 0.137 0.137000 NaN 0.137000 1.256 1.256 NaN 1.256000 0 0.0 NaN 0.0 0 0.0 NaN 0.0
3 31 4.265 4.265 NaN 4.265000 0.054 0.054000 NaN 0.054000 0.633 0.633000 NaN 0.633000 1.854 1.854 NaN 1.854000 0 0.0 NaN 0.0 0.332 0.332 NaN 0.332 0.137 0.137000 NaN 0.137000 1.256 1.256 NaN 1.256000 0 0.0 NaN 0.0 0 0.0 NaN 0.0
4 32 1.013 3.181 3.6146 3.542333 0.001 0.036333 0.0434 0.042222 0.256 0.507333 0.5576 0.549222 0.213 1.307 1.5258 1.489333 0 0.0 0.0 0.0 0.026 0.230 0.2708 0.264 0.015 0.096333 0.1126 0.109889 0.503 1.005 1.1054 1.088667 0 0.0 0.0 0.0 0 0.0 0.0 0.0
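
A rolling mean with `window=n` is undefined for the first `n - 1` rows, which is why the SMA columns start with NaN. The sketch below shows this on the first five Total Investment values. It also shows `min_periods=1`, which averages over however many weeks are available; that is one possible alternative and is not what this notebook does.

```python
import pandas as pd

# First five weekly Total Investment values from the table above
spend = pd.Series([4.265, 4.265, 4.265, 4.265, 1.013])

# Strict window: NaN until three observations are available
sma_strict = spend.rolling(window=3).mean()

# Partial window (assumed alternative): average over the weeks seen so far
sma_partial = spend.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({'strict': sma_strict, 'partial': sma_partial}))
```

The last strict value, (4.265 + 4.265 + 1.013) / 3 = 3.181, matches `Total Investment_SMA_3` for week 32.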

Calculating Ad Stock values for all Advertising media

Generating Ad Stock values for every channel and for Total Investment

In [109]:
def calculate_ad_stocks(data, engagement_factor):
    """Add an ad stock column for every media channel and for Total Investment.

    The formula for ad stock is: At = Xt + adstock rate * At-1, starting from A0 = 0.
    """
    channels = ['TV', 'Digital', 'Sponsorship', 'Content Marketing', 'Online marketing',
                'Affiliates', 'SEM', 'Radio', 'Other', 'Total Investment']

    for channel in channels:

        # Initialize the ad stock vector and the running ad stock value
        ad_stock_vector = []
        ad_stock_value = 0

        # Loop through the weekly spend to calculate the ad stock values
        for spend in data[channel]:
            ad_stock_value = spend + engagement_factor * ad_stock_value
            ad_stock_vector.append(ad_stock_value)

        # Add the ad stock vector to the dataset, right after the channel's EMA column
        loc_index = data.columns.get_loc(channel + '_EMA_8') + 1
        data.insert(loc=loc_index, column=channel.replace(' ', '_') + '_Ad_Stock', value=ad_stock_vector)

    return data

We assume an ad stock rate (engagement factor) of 0.6, i.e. 60% of the previous week's ad stock carries over to the current week

In [110]:
media_investment = calculate_ad_stocks(data=media_investment, engagement_factor=0.6)
media_investment.head(10)
Out[110]:
Week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock
0 28 4.265 NaN NaN 4.265000 4.265000 0.054 NaN NaN 0.054000 0.054000 0.633 NaN NaN 0.633000 0.633000 1.854 NaN NaN 1.854000 1.854000 0 NaN NaN 0.000 0.000 0.332 NaN NaN 0.332000 0.332000 0.137 NaN NaN 0.137000 0.137000 1.256 NaN NaN 1.256000 1.256000 0 NaN NaN 0.0 0.0 0 NaN NaN 0.0 0.0
1 29 4.265 NaN NaN 4.265000 6.824000 0.054 NaN NaN 0.054000 0.086400 0.633 NaN NaN 0.633000 1.012800 1.854 NaN NaN 1.854000 2.966400 0 NaN NaN 0.000 0.000 0.332 NaN NaN 0.332000 0.531200 0.137 NaN NaN 0.137000 0.219200 1.256 NaN NaN 1.256000 2.009600 0 NaN NaN 0.0 0.0 0 NaN NaN 0.0 0.0
2 30 4.265 4.265000 NaN 4.265000 8.359400 0.054 0.054000 NaN 0.054000 0.105840 0.633 0.633000 NaN 0.633000 1.240680 1.854 1.854000 NaN 1.854000 3.633840 0 0.000 NaN 0.000 0.000 0.332 0.332000 NaN 0.332000 0.650720 0.137 0.137000 NaN 0.137000 0.268520 1.256 1.256000 NaN 1.256000 2.461760 0 0.0 NaN 0.0 0.0 0 0.0 NaN 0.0 0.0
3 31 4.265 4.265000 NaN 4.265000 9.280640 0.054 0.054000 NaN 0.054000 0.117504 0.633 0.633000 NaN 0.633000 1.377408 1.854 1.854000 NaN 1.854000 4.034304 0 0.000 NaN 0.000 0.000 0.332 0.332000 NaN 0.332000 0.722432 0.137 0.137000 NaN 0.137000 0.298112 1.256 1.256000 NaN 1.256000 2.733056 0 0.0 NaN 0.0 0.0 0 0.0 NaN 0.0 0.0
4 32 1.013 3.181000 3.6146 3.542333 6.581384 0.001 0.036333 0.0434 0.042222 0.071502 0.256 0.507333 0.5576 0.549222 1.082445 0.213 1.307000 1.5258 1.489333 2.633582 0 0.000 0.0000 0.000 0.000 0.026 0.230000 0.2708 0.264000 0.459459 0.015 0.096333 0.1126 0.109889 0.193867 0.503 1.005000 1.1054 1.088667 2.142834 0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0
5 33 1.013 2.097000 2.9642 2.980259 4.961830 0.001 0.018667 0.0328 0.033062 0.043901 0.256 0.381667 0.4822 0.484062 0.905467 0.213 0.760000 1.1976 1.205704 1.793149 0 0.000 0.0000 0.000 0.000 0.026 0.128000 0.2096 0.211111 0.301676 0.015 0.055667 0.0882 0.088802 0.131320 0.503 0.754000 0.9548 0.958519 1.788700 0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0
6 34 1.013 1.013000 2.3138 2.543091 3.990098 0.001 0.001000 0.0222 0.025937 0.027341 0.256 0.256000 0.4068 0.433381 0.799280 0.213 0.213000 0.8694 0.985103 1.288890 0 0.000 0.0000 0.000 0.000 0.026 0.026000 0.1484 0.169975 0.207005 0.015 0.015000 0.0638 0.072402 0.093792 0.503 0.503000 0.8042 0.857292 1.576220 0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0
7 35 1.013 1.013000 1.6634 2.203070 3.407059 0.001 0.001000 0.0116 0.020395 0.017405 0.256 0.256000 0.3314 0.393963 0.735568 0.213 0.213000 0.5412 0.813524 0.986334 0 0.000 0.0000 0.000 0.000 0.026 0.026000 0.0872 0.137981 0.150203 0.015 0.015000 0.0394 0.059646 0.071275 0.503 0.503000 0.6536 0.778561 1.448732 0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0
8 36 1.013 1.013000 1.0130 1.938610 3.057235 0.001 0.001000 0.0010 0.016085 0.011443 0.256 0.256000 0.2560 0.363305 0.697341 0.213 0.213000 0.2130 0.680075 0.804800 0 0.000 0.0000 0.000 0.000 0.026 0.026000 0.0260 0.113096 0.116122 0.015 0.015000 0.0150 0.049725 0.057765 0.503 0.503000 0.5030 0.717325 1.372239 0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0
9 37 24.064 8.696667 5.6232 6.855364 25.898341 0.97 0.324000 0.1948 0.228066 0.976866 0.339 0.283667 0.2726 0.357904 0.757405 15.697 5.374333 3.3098 4.017169 16.179880 0.153 0.051 0.0306 0.034 0.153 4.095 1.382333 0.8398 0.997964 4.164673 1.26 0.430000 0.2640 0.318675 1.294659 1.551 0.852333 0.7126 0.902586 2.374344 0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0
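
The ad stock recurrence `At = Xt + rate * At-1` can be checked by hand against the `TV_Ad_Stock` column above. The sketch below uses a standalone helper, `ad_stock`, which is a hypothetical name introduced only for this illustration.

```python
def ad_stock(spend, rate):
    """Apply At = Xt + rate * At-1 to a sequence of weekly spends, starting from 0."""
    values = []
    carry = 0.0
    for x in spend:
        carry = x + rate * carry
        values.append(carry)
    return values

# First five weekly TV values from the table above
tv_spend = [0.054, 0.054, 0.054, 0.054, 0.001]
tv_ad_stock = ad_stock(tv_spend, rate=0.6)

print([round(v, 6) for v in tv_ad_stock])
```

With a constant weekly spend X, the ad stock converges towards X / (1 - rate), which is 2.5 times the weekly spend at a rate of 0.6. This explains why the ad stock columns keep rising during the four July weeks even though the spend is flat.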
In [111]:
media_investment.fillna(value=0, inplace=True)
media_investment.head(10)
Out[111]:
Week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock
0 28 4.265 0.000000 0.0000 4.265000 4.265000 0.054 0.000000 0.0000 0.054000 0.054000 0.633 0.000000 0.0000 0.633000 0.633000 1.854 0.000000 0.0000 1.854000 1.854000 0.000 0.000 0.0000 0.000 0.000 0.332 0.000000 0.0000 0.332000 0.332000 0.137 0.000000 0.0000 0.137000 0.137000 1.256 0.000000 0.0000 1.256000 1.256000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 29 4.265 0.000000 0.0000 4.265000 6.824000 0.054 0.000000 0.0000 0.054000 0.086400 0.633 0.000000 0.0000 0.633000 1.012800 1.854 0.000000 0.0000 1.854000 2.966400 0.000 0.000 0.0000 0.000 0.000 0.332 0.000000 0.0000 0.332000 0.531200 0.137 0.000000 0.0000 0.137000 0.219200 1.256 0.000000 0.0000 1.256000 2.009600 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 30 4.265 4.265000 0.0000 4.265000 8.359400 0.054 0.054000 0.0000 0.054000 0.105840 0.633 0.633000 0.0000 0.633000 1.240680 1.854 1.854000 0.0000 1.854000 3.633840 0.000 0.000 0.0000 0.000 0.000 0.332 0.332000 0.0000 0.332000 0.650720 0.137 0.137000 0.0000 0.137000 0.268520 1.256 1.256000 0.0000 1.256000 2.461760 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 31 4.265 4.265000 0.0000 4.265000 9.280640 0.054 0.054000 0.0000 0.054000 0.117504 0.633 0.633000 0.0000 0.633000 1.377408 1.854 1.854000 0.0000 1.854000 4.034304 0.000 0.000 0.0000 0.000 0.000 0.332 0.332000 0.0000 0.332000 0.722432 0.137 0.137000 0.0000 0.137000 0.298112 1.256 1.256000 0.0000 1.256000 2.733056 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 32 1.013 3.181000 3.6146 3.542333 6.581384 0.001 0.036333 0.0434 0.042222 0.071502 0.256 0.507333 0.5576 0.549222 1.082445 0.213 1.307000 1.5258 1.489333 2.633582 0.000 0.000 0.0000 0.000 0.000 0.026 0.230000 0.2708 0.264000 0.459459 0.015 0.096333 0.1126 0.109889 0.193867 0.503 1.005000 1.1054 1.088667 2.142834 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 33 1.013 2.097000 2.9642 2.980259 4.961830 0.001 0.018667 0.0328 0.033062 0.043901 0.256 0.381667 0.4822 0.484062 0.905467 0.213 0.760000 1.1976 1.205704 1.793149 0.000 0.000 0.0000 0.000 0.000 0.026 0.128000 0.2096 0.211111 0.301676 0.015 0.055667 0.0882 0.088802 0.131320 0.503 0.754000 0.9548 0.958519 1.788700 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
6 34 1.013 1.013000 2.3138 2.543091 3.990098 0.001 0.001000 0.0222 0.025937 0.027341 0.256 0.256000 0.4068 0.433381 0.799280 0.213 0.213000 0.8694 0.985103 1.288890 0.000 0.000 0.0000 0.000 0.000 0.026 0.026000 0.1484 0.169975 0.207005 0.015 0.015000 0.0638 0.072402 0.093792 0.503 0.503000 0.8042 0.857292 1.576220 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 35 1.013 1.013000 1.6634 2.203070 3.407059 0.001 0.001000 0.0116 0.020395 0.017405 0.256 0.256000 0.3314 0.393963 0.735568 0.213 0.213000 0.5412 0.813524 0.986334 0.000 0.000 0.0000 0.000 0.000 0.026 0.026000 0.0872 0.137981 0.150203 0.015 0.015000 0.0394 0.059646 0.071275 0.503 0.503000 0.6536 0.778561 1.448732 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 36 1.013 1.013000 1.0130 1.938610 3.057235 0.001 0.001000 0.0010 0.016085 0.011443 0.256 0.256000 0.2560 0.363305 0.697341 0.213 0.213000 0.2130 0.680075 0.804800 0.000 0.000 0.0000 0.000 0.000 0.026 0.026000 0.0260 0.113096 0.116122 0.015 0.015000 0.0150 0.049725 0.057765 0.503 0.503000 0.5030 0.717325 1.372239 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 37 24.064 8.696667 5.6232 6.855364 25.898341 0.970 0.324000 0.1948 0.228066 0.976866 0.339 0.283667 0.2726 0.357904 0.757405 15.697 5.374333 3.3098 4.017169 16.179880 0.153 0.051 0.0306 0.034 0.153 4.095 1.382333 0.8398 0.997964 4.164673 1.260 0.430000 0.2640 0.318675 1.294659 1.551 0.852333 0.7126 0.902586 2.374344 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
In [112]:
media_investment.shape
Out[112]:
(52, 51)

NPS Score

We will convert the monthly NPS and Stock Index data into weekly data
In [113]:
net_promoter_score
Out[113]:
score July'15 Aug'15 Sept'15 Oct'15 Nov'15 Dec'15 Jan'16 Feb'16 Mar'16 Apr'16 May'16 June'16
0 NPS 54.599588 59.987101 46.925419 44.398389 47.0 45.8 47.093031 50.327406 49.02055 51.827605 47.306951 50.516687
1 Stock Index 1177.000000 1206.000000 1101.000000 1210.000000 1233.0 1038.0 1052.000000 1222.000000 1015.00000 1242.000000 1228.000000 1194.000000
Getting the dataset into a suitable format
In [114]:
# resetting index
net_promoter_score.reset_index(drop=True, inplace=True)

# Transposing the dataframe
net_promoter_score = net_promoter_score.T

# resetting index
net_promoter_score.reset_index(drop=False, inplace=True)

# renaming columns
net_promoter_score.columns = ['Month', 'NPS', 'Stock Index']
net_promoter_score.drop(net_promoter_score.index[0], inplace=True)
net_promoter_score
Out[114]:
Month NPS Stock Index
1 July'15 54.5996 1177
2 Aug'15 59.9871 1206
3 Sept'15 46.9254 1101
4 Oct'15 44.3984 1210
5 Nov'15 47 1233
6 Dec'15 45.8 1038
7 Jan'16 47.093 1052
8 Feb'16 50.3274 1222
9 Mar'16 49.0206 1015
10 Apr'16 51.8276 1242
11 May'16 47.307 1228
12 June'16 50.5167 1194
In [115]:
# Temp DataFrame

temp_nps = pd.DataFrame(index=range(total_weeks), columns=['Year', 'Month', 'Week', 'NPS', 'Stock Index'])

temp_nps
Out[115]:
Year Month Week NPS Stock Index
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN
6 NaN NaN NaN NaN NaN
7 NaN NaN NaN NaN NaN
8 NaN NaN NaN NaN NaN
9 NaN NaN NaN NaN NaN
10 NaN NaN NaN NaN NaN
11 NaN NaN NaN NaN NaN
12 NaN NaN NaN NaN NaN
13 NaN NaN NaN NaN NaN
14 NaN NaN NaN NaN NaN
15 NaN NaN NaN NaN NaN
16 NaN NaN NaN NaN NaN
17 NaN NaN NaN NaN NaN
18 NaN NaN NaN NaN NaN
19 NaN NaN NaN NaN NaN
20 NaN NaN NaN NaN NaN
21 NaN NaN NaN NaN NaN
22 NaN NaN NaN NaN NaN
23 NaN NaN NaN NaN NaN
24 NaN NaN NaN NaN NaN
25 NaN NaN NaN NaN NaN
26 NaN NaN NaN NaN NaN
27 NaN NaN NaN NaN NaN
28 NaN NaN NaN NaN NaN
29 NaN NaN NaN NaN NaN
30 NaN NaN NaN NaN NaN
31 NaN NaN NaN NaN NaN
32 NaN NaN NaN NaN NaN
33 NaN NaN NaN NaN NaN
34 NaN NaN NaN NaN NaN
35 NaN NaN NaN NaN NaN
36 NaN NaN NaN NaN NaN
37 NaN NaN NaN NaN NaN
38 NaN NaN NaN NaN NaN
39 NaN NaN NaN NaN NaN
40 NaN NaN NaN NaN NaN
41 NaN NaN NaN NaN NaN
42 NaN NaN NaN NaN NaN
43 NaN NaN NaN NaN NaN
44 NaN NaN NaN NaN NaN
45 NaN NaN NaN NaN NaN
46 NaN NaN NaN NaN NaN
47 NaN NaN NaN NaN NaN
48 NaN NaN NaN NaN NaN
49 NaN NaN NaN NaN NaN
50 NaN NaN NaN NaN NaN
51 NaN NaN NaN NaN NaN

NPS Score and Stock Index are reported once per month, so we assign the monthly value to every week of that month

In [116]:
i = 0

# Two-digit year suffix -> (year, week lookup built above)
lookup_by_suffix = {'15': (2015, dict_name_2015), '16': (2016, dict_name_2016)}

for index, row in net_promoter_score.iterrows():

    # Labels look like "Sept'15": keep the first three letters of the month name
    month_label, year_suffix = row['Month'].split("'")
    month = month_label[0:3]
    year, weeks_by_name = lookup_by_suffix[year_suffix]

    for week in weeks_by_name[month]:

        # .loc[i, col] writes to the frame itself; chained .iloc[i].col may write to a temporary copy
        temp_nps.loc[i, 'Year'] = year
        temp_nps.loc[i, 'Month'] = month
        temp_nps.loc[i, 'Week'] = week
        temp_nps.loc[i, 'NPS'] = row['NPS']
        temp_nps.loc[i, 'Stock Index'] = row['Stock Index']

        i += 1

temp_nps.head()
Out[116]:
Year Month Week NPS Stock Index
0 2015 Jul 28 54.5996 1177
1 2015 Jul 29 54.5996 1177
2 2015 Jul 30 54.5996 1177
3 2015 Jul 31 54.5996 1177
4 2015 Aug 32 59.9871 1206
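
The month labels in the NPS data ("July'15", "Sept'15", "June'16") have to be reduced to the three-letter abbreviations used as keys in the week lookup. A minimal sketch of that parsing step follows; `parse_month_label` is a hypothetical helper name, and the sketch assumes every label contains exactly one apostrophe followed by a two-digit year in the 2000s.

```python
def parse_month_label(label):
    """Split a label such as "Sept'15" into a 3-letter month key and a 4-digit year."""
    month_name, year_suffix = label.split("'")
    return month_name[:3], 2000 + int(year_suffix)

for label in ["July'15", "Sept'15", "June'16"]:
    print(label, '->', parse_month_label(label))
```

Truncating to three letters is what makes "Sept" and "July" line up with the `Sep` and `Jul` keys produced by `calendar.month_abbr`.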
In [117]:
temp_nps['Week'] = temp_nps['Week'].astype('str')
In [118]:
net_promoter_score = temp_nps

net_promoter_score
Out[118]:
Year Month Week NPS Stock Index
0 2015 Jul 28 54.5996 1177
1 2015 Jul 29 54.5996 1177
2 2015 Jul 30 54.5996 1177
3 2015 Jul 31 54.5996 1177
4 2015 Aug 32 59.9871 1206
5 2015 Aug 33 59.9871 1206
6 2015 Aug 34 59.9871 1206
7 2015 Aug 35 59.9871 1206
8 2015 Aug 36 59.9871 1206
9 2015 Sep 37 46.9254 1101
10 2015 Sep 38 46.9254 1101
11 2015 Sep 39 46.9254 1101
12 2015 Sep 40 46.9254 1101
13 2015 Oct 41 44.3984 1210
14 2015 Oct 42 44.3984 1210
15 2015 Oct 43 44.3984 1210
16 2015 Oct 44 44.3984 1210
17 2015 Nov 45 47 1233
18 2015 Nov 46 47 1233
19 2015 Nov 47 47 1233
20 2015 Nov 48 47 1233
21 2015 Nov 49 47 1233
22 2015 Dec 50 45.8 1038
23 2015 Dec 51 45.8 1038
24 2015 Dec 52 45.8 1038
25 2015 Dec 53 45.8 1038
26 2016 Jan 1 47.093 1052
27 2016 Jan 2 47.093 1052
28 2016 Jan 3 47.093 1052
29 2016 Jan 4 47.093 1052
30 2016 Feb 5 50.3274 1222
31 2016 Feb 6 50.3274 1222
32 2016 Feb 7 50.3274 1222
33 2016 Feb 8 50.3274 1222
34 2016 Feb 9 50.3274 1222
35 2016 Mar 10 49.0206 1015
36 2016 Mar 11 49.0206 1015
37 2016 Mar 12 49.0206 1015
38 2016 Mar 13 49.0206 1015
39 2016 Apr 14 51.8276 1242
40 2016 Apr 15 51.8276 1242
41 2016 Apr 16 51.8276 1242
42 2016 Apr 17 51.8276 1242
43 2016 May 18 47.307 1228
44 2016 May 19 47.307 1228
45 2016 May 20 47.307 1228
46 2016 May 21 47.307 1228
47 2016 May 22 47.307 1228
48 2016 Jun 23 50.5167 1194
49 2016 Jun 24 50.5167 1194
50 2016 Jun 25 50.5167 1194
51 2016 Jun 26 50.5167 1194
In [119]:
net_promoter_score.duplicated('Week').value_counts()
Out[119]:
False    52
dtype: int64

No duplicates

In [120]:
net_promoter_score.isnull().values.any()
Out[120]:
False

No Nulls

In [121]:
# Dropping Year and Month columns as we won't be needing them anymore

del net_promoter_score['Year']
del net_promoter_score['Month']
In [122]:
net_promoter_score.head()
Out[122]:
Week NPS Stock Index
0 28 54.5996 1177
1 29 54.5996 1177
2 30 54.5996 1177
3 31 54.5996 1177
4 32 59.9871 1206
In [123]:
nps_original_col = net_promoter_score.columns[1:]
nps_original_col
Out[123]:
Index(['NPS', 'Stock Index'], dtype='object')

Calculating the 5-week Simple Moving Average for NPS and Stock Index

In [124]:
net_promoter_score = SMA_variables(net_promoter_score,nps_original_col,5) 
net_promoter_score.head()
Out[124]:
Week NPS NPS_SMA_5 Stock Index Stock Index_SMA_5
0 28 54.5996 NaN 1177 NaN
1 29 54.5996 NaN 1177 NaN
2 30 54.5996 NaN 1177 NaN
3 31 54.5996 NaN 1177 NaN
4 32 59.9871 55.677091 1206 1182.8
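
`SMA_variables` is defined earlier in the notebook and is not reproduced here. As a rough sketch of what a k-week simple moving average computes, assuming it is a plain trailing `rolling(k).mean()` (the helper name `add_sma` and the toy frame are illustrative, not the notebook's implementation):

```python
import pandas as pd

def add_sma(df, columns, window):
    """Return a copy of df with a trailing simple moving average per column.

    Hypothetical helper: the first window-1 rows have no full window,
    so their moving average is NaN.
    """
    out = df.copy()
    for col in columns:
        out[f"{col}_SMA_{window}"] = out[col].rolling(window).mean()
    return out

# Toy frame using the first five NPS values shown above
toy = pd.DataFrame({"Week": ["28", "29", "30", "31", "32"],
                    "NPS": [54.5996, 54.5996, 54.5996, 54.5996, 59.9871]})
result = add_sma(toy, ["NPS"], 5)
print(result)
```

Under that assumption the first four rows are NaN and the fifth is the mean of the five values, which is consistent with the NaN pattern in the output above.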

Calculating the 3-week Simple Moving Average for NPS and Stock Index

In [125]:
net_promoter_score = SMA_variables(net_promoter_score,nps_original_col,3) 
net_promoter_score.head()
Out[125]:
Week NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5
0 28 54.5996 NaN NaN 1177 NaN NaN
1 29 54.5996 NaN NaN 1177 NaN NaN
2 30 54.5996 54.599588 NaN 1177 1177.000000 NaN
3 31 54.5996 54.599588 NaN 1177 1177.000000 NaN
4 32 59.9871 56.395426 55.677091 1206 1186.666667 1182.8
In [126]:
# The leading rows of each SMA column have no full window and are NaN; fill them with 0
net_promoter_score.fillna(value=0, inplace=True)
net_promoter_score.head()
Out[126]:
Week NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5
0 28 54.599588 0.000000 0.000000 1177.0 0.000000 0.0
1 29 54.599588 0.000000 0.000000 1177.0 0.000000 0.0
2 30 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0
3 31 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0
4 32 59.987101 56.395426 55.677091 1206.0 1186.666667 1182.8

Sale Calendar

We will generate weekly data from Year and Month for the sale calendar data

In [127]:
sale_calendar
Out[127]:
Unnamed: 1 Sales Calendar
0 2015.0 (18-19th July)
1 2015.0 (15-17th Aug)
2 2015.0 (28-30th Aug)
3 2015.0 (17-15th Oct)
4 2015.0 (7-14th Nov)
5 2015.0 (25th Dec'15 - 3rd Jan'16)
6 2016.0 (20-22 Jan)
7 2016.0 (1-2 Feb)
8 2016.0 (20-21 Feb)
9 2016.0 (14-15 Feb)
10 2016.0 (7-9 Mar)
11 2016.0 (25-27 May)
In [128]:
# Getting the data into required format

sale_calendar.columns = ['Year', 'Sale']

sale_calendar.Year = sale_calendar.Year.apply(lambda x: int(x))
sale_calendar.Sale = sale_calendar.Sale.apply(lambda x: x.replace('th','').strip())
sale_calendar.Sale = sale_calendar.Sale.apply(lambda x: x.replace('rd','').strip())
sale_calendar.Sale = sale_calendar.Sale.apply(lambda x: x.replace('(','').strip())
sale_calendar.Sale = sale_calendar.Sale.apply(lambda x: x.replace(')','').strip())

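# Splitting "25 Dec'15 - 3 Jan'16" into two rows ('25-31 Dec' and '1-3 Jan') for the ease of weekly data generation

sale_calendar.iloc[5, sale_calendar.columns.get_loc('Sale')] = '25-31 Dec'
# DataFrame.append was removed in pandas 2.0; pd.concat with ignore_index=True gives the same result
sale_calendar = pd.concat([sale_calendar, pd.DataFrame([{'Year': 2016, 'Sale': '1-3 Jan'}])], ignore_index=True)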

sale_calendar.sort_values(by=['Year'], ascending = True, inplace = True)

sale_calendar
Out[128]:
Year Sale
0 2015 18-19 July
1 2015 15-17 Aug
2 2015 28-30 Aug
3 2015 17-15 Oct
4 2015 7-14 Nov
5 2015 25-31 Dec
6 2016 20-22 Jan
7 2016 1-2 Feb
8 2016 20-21 Feb
9 2016 14-15 Feb
10 2016 7-9 Mar
11 2016 25-27 May
12 2016 1-3 Jan
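
The chained `replace('th','')` and `replace('rd','')` calls above remove those letters wherever they occur, which happens to be safe for these labels. A minimal sketch of a narrower cleanup that strips an ordinal suffix only when it follows a digit (the helper name `clean_sale_label` is hypothetical):

```python
import re

def clean_sale_label(raw):
    """Strip surrounding parentheses and ordinal suffixes such as '19th' or '3rd'."""
    text = raw.strip().strip("()")
    # Remove st/nd/rd/th only when directly preceded by a digit
    text = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", text)
    return text.strip()

labels = ["(18-19th July)", "(25th Dec'15 - 3rd Jan'16)", "(20-22 Jan)"]
cleaned = [clean_sale_label(label) for label in labels]
print(cleaned)
```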
In [129]:
# Dropping the '17-15 Oct' entry, assuming it is erroneous (its start day is later than its end day)

sale_calendar.drop(sale_calendar.index[3], inplace=True)

sale_calendar
Out[129]:
Year Sale
0 2015 18-19 July
1 2015 15-17 Aug
2 2015 28-30 Aug
4 2015 7-14 Nov
5 2015 25-31 Dec
6 2016 20-22 Jan
7 2016 1-2 Feb
8 2016 20-21 Feb
9 2016 14-15 Feb
10 2016 7-9 Mar
11 2016 25-27 May
12 2016 1-3 Jan
In [130]:
# Temp DataFrame

temp_sale_calendar = pd.DataFrame(index=range(total_weeks), columns=['Year', 'Month', 'Week', 'Sale'])

temp_sale_calendar
Out[130]:
Year Month Week Sale
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
6 NaN NaN NaN NaN
7 NaN NaN NaN NaN
8 NaN NaN NaN NaN
9 NaN NaN NaN NaN
10 NaN NaN NaN NaN
11 NaN NaN NaN NaN
12 NaN NaN NaN NaN
13 NaN NaN NaN NaN
14 NaN NaN NaN NaN
15 NaN NaN NaN NaN
16 NaN NaN NaN NaN
17 NaN NaN NaN NaN
18 NaN NaN NaN NaN
19 NaN NaN NaN NaN
20 NaN NaN NaN NaN
21 NaN NaN NaN NaN
22 NaN NaN NaN NaN
23 NaN NaN NaN NaN
24 NaN NaN NaN NaN
25 NaN NaN NaN NaN
26 NaN NaN NaN NaN
27 NaN NaN NaN NaN
28 NaN NaN NaN NaN
29 NaN NaN NaN NaN
30 NaN NaN NaN NaN
31 NaN NaN NaN NaN
32 NaN NaN NaN NaN
33 NaN NaN NaN NaN
34 NaN NaN NaN NaN
35 NaN NaN NaN NaN
36 NaN NaN NaN NaN
37 NaN NaN NaN NaN
38 NaN NaN NaN NaN
39 NaN NaN NaN NaN
40 NaN NaN NaN NaN
41 NaN NaN NaN NaN
42 NaN NaN NaN NaN
43 NaN NaN NaN NaN
44 NaN NaN NaN NaN
45 NaN NaN NaN NaN
46 NaN NaN NaN NaN
47 NaN NaN NaN NaN
48 NaN NaN NaN NaN
49 NaN NaN NaN NaN
50 NaN NaN NaN NaN
51 NaN NaN NaN NaN

%V - ISO 8601 week as a decimal number with Monday as the first day of the week.

We find the number of days in each week of a month on which there was a sale

In [131]:
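i = 0
# Note: this rebinds the name `datetime` from the class imported in Step 1 to the module,
# which is why the code below calls datetime.datetime.strptime(...)
import datetime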

for index, row in sale_calendar.iterrows():

    week_list = []
    month_list = []
    year_list = []

    date1, date2, month = re.split("[- ]", row.Sale)
    month = month[0:3] # Taking only the first 3 characters of the month name
    year = row.Year
    
#     print("Date => {}-{}-{}".format(date1, month, year))
#     print(datetime.datetime.strptime('{}-{}-{}'.format(date1, month, year), "%d-%b-%Y").strftime("%V"))
#     print("Date => {}-{}-{}".format(date2, month, year))
#     print(datetime.datetime.strptime('{}-{}-{}'.format(date2, month, year), "%d-%b-%Y").strftime("%V"))
    
    date = int(date1)
    
    while date <= int(date2):
        
        # Extracting the week numbers of the range of dates in each month
        week = datetime.datetime.strptime('{}-{}-{}'.format(date, month, year), "%d-%b-%Y").strftime("%V")
        week_list.append(int(week))
        date+=1
    
    week_dict = dict.fromkeys(week_list, 0) # Generating a week list dictionary with default value as 0

    for j in week_list:
        week_dict[j] = week_dict[j] + 1

    for key, value in week_dict.items():
        # Assign the whole row in one step; chained `.iloc[i].col = value` can write to a temporary copy
        temp_sale_calendar.iloc[i] = [year, month, key, value]
        
        i+=1
    
temp_sale_calendar.dropna(axis=0, how='all', inplace=True) # Drop all rows in which every value is null
temp_sale_calendar.sort_values(by=['Week'], ascending = True, inplace = True)

# Handling the special cases of 53 and 7 week numbers which appear twice
temp_sale_calendar.loc[temp_sale_calendar['Week'] == 53, 'Sale'] = temp_sale_calendar.loc[temp_sale_calendar['Week'] == 53, 'Sale'].sum()
temp_sale_calendar.loc[temp_sale_calendar['Week'] == 7, 'Sale'] = temp_sale_calendar.loc[temp_sale_calendar['Week'] == 7, 'Sale'].sum()

# Dropping the redundant 53 and 7 week numbers
temp_sale_calendar.drop_duplicates(subset=['Week'], keep='first', inplace=True)
temp_sale_calendar.reset_index(drop=True, inplace=True)

temp_sale_calendar.head()
Out[131]:
Year Month Week Sale
0 2016 Jan 3 3
1 2016 Feb 5 2
2 2016 Feb 6 1
3 2016 Feb 7 3
4 2016 Mar 10 3
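
The day-counting loop above can also be sketched with `date.isocalendar()`, which returns the ISO year together with the ISO week and so avoids relying on `strftime("%V")` being supported by the platform. The helper name `sale_days_per_iso_week` is hypothetical, and the example uses the Christmas sale range from the calendar:

```python
from collections import Counter
from datetime import date, timedelta

def sale_days_per_iso_week(start, end):
    """Count the sale days in each (ISO year, ISO week) from start to end, inclusive."""
    counts = Counter()
    day = start
    while day <= end:
        iso_year, iso_week, _ = day.isocalendar()
        counts[(iso_year, iso_week)] += 1
        day += timedelta(days=1)
    return dict(counts)

christmas_sale = sale_days_per_iso_week(date(2015, 12, 25), date(2016, 1, 3))
print(christmas_sale)
```

Keying on the ISO year as well as the week shows why week 53 appears twice in the loop above: 1 to 3 January 2016 still belong to ISO week 53 of 2015.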
In [132]:
temp_sale_calendar['Week'] = temp_sale_calendar['Week'].astype('str')
In [133]:
sale_calendar = temp_sale_calendar

sale_calendar.head()
Out[133]:
Year Month Week Sale
0 2016 Jan 3 3
1 2016 Feb 5 2
2 2016 Feb 6 1
3 2016 Feb 7 3
4 2016 Mar 10 3
In [134]:
sale_calendar.duplicated('Week').value_counts()
Out[134]:
False    14
dtype: int64

No duplicates

In [135]:
sale_calendar.isnull().values.any()
Out[135]:
False

No Nulls

In [136]:
# Dropping Year and Month columns as we won't be needing them anymore

del sale_calendar['Year']
del sale_calendar['Month']

Climate 2015

We will generate weekly data from date for the climate data of 2015

In [137]:
# Skipping the first 24 rows while reading the data to ignore the metadata

climate_2015 = pd.read_csv('ONTARIO-2015.csv', skiprows=24)

climate_2015.head()
Out[137]:
Date/Time Year Month Day Data Quality Max Temp (°C) Max Temp Flag Min Temp (°C) Min Temp Flag Mean Temp (°C) Mean Temp Flag Heat Deg Days (°C) Heat Deg Days Flag Cool Deg Days (°C) Cool Deg Days Flag Total Rain (mm) Total Rain Flag Total Snow (cm) Total Snow Flag Total Precip (mm) Total Precip Flag Snow on Grnd (cm) Snow on Grnd Flag Dir of Max Gust (10s deg) Dir of Max Gust Flag Spd of Max Gust (km/h) Spd of Max Gust Flag
0 2015-01-01 2015 1 1 † 0.0 NaN -8.5 NaN -4.3 NaN 22.3 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN NaN NaN NaN NaN
1 2015-01-02 2015 1 2 † 3.0 NaN -3.0 NaN 0.0 NaN 18.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN NaN NaN NaN NaN
2 2015-01-03 2015 1 3 † 2.5 NaN -4.0 NaN -0.8 NaN 18.8 NaN 0.0 NaN 24.0 NaN 0.0 NaN 24.0 NaN 0.0 NaN NaN NaN NaN NaN
3 2015-01-04 2015 1 4 † 2.5 NaN 0.0 NaN 1.3 NaN 16.7 NaN 0.0 NaN 0.0 NaN 1.0 NaN 1.0 NaN 0.0 NaN NaN NaN NaN NaN
4 2015-01-05 2015 1 5 † -10.0 NaN -13.5 NaN -11.8 NaN 29.8 NaN 0.0 NaN 0.0 NaN 3.0 NaN 3.0 NaN 1.0 NaN NaN NaN NaN NaN
In [138]:
climate_2015.columns
Out[138]:
Index(['Date/Time', 'Year', 'Month', 'Day', 'Data Quality', 'Max Temp (°C)', 'Max Temp Flag', 'Min Temp (°C)', 'Min Temp Flag', 'Mean Temp (°C)', 'Mean Temp Flag', 'Heat Deg Days (°C)', 'Heat Deg Days Flag', 'Cool Deg Days (°C)', 'Cool Deg Days Flag', 'Total Rain (mm)', 'Total Rain Flag', 'Total Snow (cm)', 'Total Snow Flag', 'Total Precip (mm)', 'Total Precip Flag', 'Snow on Grnd (cm)', 'Snow on Grnd Flag', 'Dir of Max Gust (10s deg)', 'Dir of Max Gust Flag', 'Spd of Max Gust (km/h)', 'Spd of Max Gust Flag'], dtype='object')
In [139]:
# Dropping the columns with all nulls

climate_2015.dropna(axis=1, thresh=1, inplace=True)

climate_2015.reset_index(drop=True, inplace=True)

climate_2015.head()
Out[139]:
Date/Time Year Month Day Data Quality Max Temp (°C) Max Temp Flag Min Temp (°C) Min Temp Flag Mean Temp (°C) Mean Temp Flag Heat Deg Days (°C) Heat Deg Days Flag Cool Deg Days (°C) Cool Deg Days Flag Total Rain (mm) Total Rain Flag Total Snow (cm) Total Snow Flag Total Precip (mm) Total Precip Flag Snow on Grnd (cm) Snow on Grnd Flag
0 2015-01-01 2015 1 1 † 0.0 NaN -8.5 NaN -4.3 NaN 22.3 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN
1 2015-01-02 2015 1 2 † 3.0 NaN -3.0 NaN 0.0 NaN 18.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN
2 2015-01-03 2015 1 3 † 2.5 NaN -4.0 NaN -0.8 NaN 18.8 NaN 0.0 NaN 24.0 NaN 0.0 NaN 24.0 NaN 0.0 NaN
3 2015-01-04 2015 1 4 † 2.5 NaN 0.0 NaN 1.3 NaN 16.7 NaN 0.0 NaN 0.0 NaN 1.0 NaN 1.0 NaN 0.0 NaN
4 2015-01-05 2015 1 5 † -10.0 NaN -13.5 NaN -11.8 NaN 29.8 NaN 0.0 NaN 0.0 NaN 3.0 NaN 3.0 NaN 1.0 NaN
In [140]:
climate_2015.columns
Out[140]:
Index(['Date/Time', 'Year', 'Month', 'Day', 'Data Quality', 'Max Temp (°C)', 'Max Temp Flag', 'Min Temp (°C)', 'Min Temp Flag', 'Mean Temp (°C)', 'Mean Temp Flag', 'Heat Deg Days (°C)', 'Heat Deg Days Flag', 'Cool Deg Days (°C)', 'Cool Deg Days Flag', 'Total Rain (mm)', 'Total Rain Flag', 'Total Snow (cm)', 'Total Snow Flag', 'Total Precip (mm)', 'Total Precip Flag', 'Snow on Grnd (cm)', 'Snow on Grnd Flag'], dtype='object')
In [141]:
# Dropping the '... Flag' columns (each holds a single distinct non-null value) and 'Data Quality'

drop_cols = []

for col in climate_2015.columns:
    if 'Flag' in col:
        print(climate_2015[col].value_counts())
        print()
        drop_cols.append(col)

drop_cols.append('Data Quality')

climate_2015.drop(drop_cols, axis=1, inplace=True)

print(climate_2015.columns)
M    13
Name: Max Temp Flag, dtype: int64

M    39
Name: Min Temp Flag, dtype: int64

M    39
Name: Mean Temp Flag, dtype: int64

M    39
Name: Heat Deg Days Flag, dtype: int64

M    39
Name: Cool Deg Days Flag, dtype: int64

T    7
Name: Total Rain Flag, dtype: int64

T    9
Name: Total Snow Flag, dtype: int64

T    12
Name: Total Precip Flag, dtype: int64

T    2
Name: Snow on Grnd Flag, dtype: int64

Index(['Date/Time', 'Year', 'Month', 'Day', 'Max Temp (°C)', 'Min Temp (°C)', 'Mean Temp (°C)', 'Heat Deg Days (°C)', 'Cool Deg Days (°C)', 'Total Rain (mm)', 'Total Snow (cm)', 'Total Precip (mm)', 'Snow on Grnd (cm)'], dtype='object')
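
The same column selection can be sketched without an explicit loop by masking on the column names. This is an illustration on a toy frame, not a replacement for the value-count inspection above:

```python
import pandas as pd

toy = pd.DataFrame({
    "Date/Time": ["2015-01-01", "2015-01-02"],
    "Data Quality": ["†", "†"],
    "Max Temp (°C)": [0.0, 3.0],
    "Max Temp Flag": [None, "M"],
})

# Keep every column that is neither a '... Flag' column nor 'Data Quality'
is_flag = toy.columns.str.endswith("Flag")
cleaned = toy.loc[:, ~is_flag].drop(columns=["Data Quality"])
print(list(cleaned.columns))
```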
In [142]:
# Formatting column names

climate_2015.columns = [col.replace(' (°C)','').strip() for col in climate_2015.columns]

climate_2015.columns
Out[142]:
Index(['Date/Time', 'Year', 'Month', 'Day', 'Max Temp', 'Min Temp', 'Mean Temp', 'Heat Deg Days', 'Cool Deg Days', 'Total Rain (mm)', 'Total Snow (cm)', 'Total Precip (mm)', 'Snow on Grnd (cm)'], dtype='object')
In [143]:
climate_2015.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 365 entries, 0 to 364
Data columns (total 13 columns):
Date/Time            365 non-null object
Year                 365 non-null int64
Month                365 non-null int64
Day                  365 non-null int64
Max Temp             216 non-null float64
Min Temp             190 non-null float64
Mean Temp            190 non-null float64
Heat Deg Days        190 non-null float64
Cool Deg Days        190 non-null float64
Total Rain (mm)      229 non-null float64
Total Snow (cm)      229 non-null float64
Total Precip (mm)    229 non-null float64
Snow on Grnd (cm)    229 non-null float64
dtypes: float64(9), int64(3), object(1)
memory usage: 37.1+ KB
In [144]:
# Dropping rows where any of the measurement columns is null (dropna defaults to how='any')

cols = ['Max Temp',
 'Min Temp',
 'Mean Temp',
 'Heat Deg Days',
 'Cool Deg Days',
 'Total Rain (mm)',
 'Total Snow (cm)',
 'Total Precip (mm)',
 'Snow on Grnd (cm)']

climate_2015.dropna(subset=cols, inplace=True)

climate_2015.reset_index(drop=True, inplace=True)

climate_2015.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 190 entries, 0 to 189
Data columns (total 13 columns):
Date/Time            190 non-null object
Year                 190 non-null int64
Month                190 non-null int64
Day                  190 non-null int64
Max Temp             190 non-null float64
Min Temp             190 non-null float64
Mean Temp            190 non-null float64
Heat Deg Days        190 non-null float64
Cool Deg Days        190 non-null float64
Total Rain (mm)      190 non-null float64
Total Snow (cm)      190 non-null float64
Total Precip (mm)    190 non-null float64
Snow on Grnd (cm)    190 non-null float64
dtypes: float64(9), int64(3), object(1)
memory usage: 19.4+ KB
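
The drop from 365 to 190 rows matches the non-null count of the sparsest columns, which is what `dropna(subset=...)` with its default `how='any'` produces. A small sketch on toy data showing how `how='any'` and `how='all'` differ:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Day": [1, 2, 3],
    "Max Temp": [24.5, 26.0, np.nan],
    "Min Temp": [17.0, np.nan, np.nan],
})
cols = ["Max Temp", "Min Temp"]

any_null_dropped = toy.dropna(subset=cols)             # default how='any'
all_null_dropped = toy.dropna(subset=cols, how="all")  # only fully empty rows are removed

print(len(any_null_dropped), len(all_null_dropped))
```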
In [145]:
# Extracting week # from the date field and formatting it

climate_2015['Week'] = climate_2015['Date/Time'].apply(lambda x: datetime.datetime.strptime(x, "%Y-%m-%d").strftime("%V"))

climate_2015['Week'] = climate_2015['Week'].apply(lambda x: str(x).lstrip('0'))

climate_2015.head()
Out[145]:
Date/Time Year Month Day Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Week
0 2015-01-01 2015 1 1 0.0 -8.5 -4.3 22.3 0.0 0.0 0.0 0.0 0.0 1
1 2015-01-02 2015 1 2 3.0 -3.0 0.0 18.0 0.0 0.0 0.0 0.0 0.0 1
2 2015-01-03 2015 1 3 2.5 -4.0 -0.8 18.8 0.0 24.0 0.0 24.0 0.0 1
3 2015-01-04 2015 1 4 2.5 0.0 1.3 16.7 0.0 0.0 1.0 1.0 0.0 1
4 2015-01-05 2015 1 5 -10.0 -13.5 -11.8 29.8 0.0 0.0 3.0 3.0 1.0 2
In [146]:
# Extracting July to Dec month's data

climate_2015 = climate_2015[climate_2015['Month'] >= 7]
climate_2015.reset_index(drop=True, inplace=True)

climate_2015.head()
Out[146]:
Date/Time Year Month Day Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Week
0 2015-07-01 2015 7 1 24.5 17.0 20.8 0.0 2.8 0.0 0.0 0.0 0.0 27
1 2015-07-02 2015 7 2 24.0 14.0 19.0 0.0 1.0 0.0 0.0 0.0 0.0 27
2 2015-07-03 2015 7 3 25.0 10.0 17.5 0.5 0.0 0.0 0.0 0.0 0.0 27
3 2015-07-04 2015 7 4 26.0 11.0 18.5 0.0 0.5 0.0 0.0 0.0 0.0 27
4 2015-07-05 2015 7 5 28.0 14.0 21.0 0.0 3.0 0.0 0.0 0.0 0.0 27
In [147]:
climate_2015.sort_values(by=['Week'], ascending = True, inplace = True)
In [148]:
# Getting weekly data

climate_2015 = climate_2015.groupby(['Week']).agg({'Max Temp':"max", 'Min Temp':"min", 'Mean Temp':"mean", 'Heat Deg Days':"mean", 'Cool Deg Days':"mean", 'Total Rain (mm)':"mean", 'Total Snow (cm)':"mean", 'Total Precip (mm)':"mean", 'Snow on Grnd (cm)':"mean"}).reset_index(drop=False)

climate_2015.head()
Out[148]:
Week Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm)
0 27 28.0 10.0 19.360000 0.100000 1.460000 0.000000 0.0 0.000000 0.0
1 28 28.0 12.5 20.100000 0.283333 2.383333 4.416667 0.0 4.416667 0.0
2 29 33.0 11.0 23.183333 0.000000 5.183333 1.400000 0.0 1.400000 0.0
3 30 31.5 14.5 23.060000 0.000000 5.060000 1.080000 0.0 1.080000 0.0
4 31 33.5 16.0 24.566667 0.000000 6.566667 4.633333 0.0 4.633333 0.0
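
As a compact sketch of the two steps above (deriving the ISO week, then aggregating per week), the following uses the pandas datetime accessor instead of `strftime`. It assumes pandas 1.1 or later for `dt.isocalendar()`, and the toy values are made up:

```python
import pandas as pd

toy = pd.DataFrame({
    "Date/Time": ["2015-07-06", "2015-07-07", "2015-07-13", "2015-07-14"],
    "Max Temp": [28.0, 26.5, 33.0, 31.0],
    "Min Temp": [12.5, 14.0, 11.0, 15.0],
    "Mean Temp": [20.0, 20.2, 22.0, 23.0],
})

# ISO week number as a string without leading zeros, like the notebook's 'Week' column
dates = pd.to_datetime(toy["Date/Time"], format="%Y-%m-%d")
toy["Week"] = dates.dt.isocalendar().week.astype(int).astype(str)

# Weekly extremes for max/min temperature, weekly average for the mean temperature
weekly = (toy.groupby("Week", as_index=False)
             .agg({"Max Temp": "max", "Min Temp": "min", "Mean Temp": "mean"}))
print(weekly)
```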
In [149]:
# Dropping week 27: ISO week 27 of 2015 runs from 29 June to 5 July, so it partly belongs to June 2015

climate_2015.drop(climate_2015[climate_2015['Week'] == '27'].index, inplace = True)
In [150]:
climate_2015.duplicated().any()
Out[150]:
False

No duplicates

In [151]:
climate_2015.isnull().values.any()
Out[151]:
False

No Nulls

Climate 2016

We will generate weekly data from date for the climate data of 2016

In [152]:
# Skipping the first 24 rows while reading the data to ignore the metadata

climate_2016 = pd.read_csv('ONTARIO-2016.csv', skiprows=24)

climate_2016.head()
Out[152]:
Date/Time Year Month Day Data Quality Max Temp (°C) Max Temp Flag Min Temp (°C) Min Temp Flag Mean Temp (°C) Mean Temp Flag Heat Deg Days (°C) Heat Deg Days Flag Cool Deg Days (°C) Cool Deg Days Flag Total Rain (mm) Total Rain Flag Total Snow (cm) Total Snow Flag Total Precip (mm) Total Precip Flag Snow on Grnd (cm) Snow on Grnd Flag Dir of Max Gust (10s deg) Dir of Max Gust Flag Spd of Max Gust (km/h) Spd of Max Gust Flag
0 2016-01-01 2016 1 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2016-01-02 2016 1 2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2016-01-03 2016 1 3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2016-01-04 2016 1 4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2016-01-05 2016 1 5 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
In [153]:
climate_2016.columns
Out[153]:
Index(['Date/Time', 'Year', 'Month', 'Day', 'Data Quality', 'Max Temp (°C)', 'Max Temp Flag', 'Min Temp (°C)', 'Min Temp Flag', 'Mean Temp (°C)', 'Mean Temp Flag', 'Heat Deg Days (°C)', 'Heat Deg Days Flag', 'Cool Deg Days (°C)', 'Cool Deg Days Flag', 'Total Rain (mm)', 'Total Rain Flag', 'Total Snow (cm)', 'Total Snow Flag', 'Total Precip (mm)', 'Total Precip Flag', 'Snow on Grnd (cm)', 'Snow on Grnd Flag', 'Dir of Max Gust (10s deg)', 'Dir of Max Gust Flag', 'Spd of Max Gust (km/h)', 'Spd of Max Gust Flag'], dtype='object')
In [154]:
# Dropping the columns with all nulls

climate_2016.dropna(axis=1, thresh=1, inplace=True)

climate_2016.reset_index(drop=True, inplace=True)

climate_2016.head()
Out[154]:
Date/Time Year Month Day Data Quality Max Temp (°C) Max Temp Flag Min Temp (°C) Min Temp Flag Mean Temp (°C) Mean Temp Flag Heat Deg Days (°C) Heat Deg Days Flag Cool Deg Days (°C) Cool Deg Days Flag Total Rain (mm) Total Rain Flag Total Snow (cm) Total Snow Flag Total Precip (mm) Total Precip Flag Snow on Grnd (cm) Snow on Grnd Flag
0 2016-01-01 2016 1 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2016-01-02 2016 1 2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2016-01-03 2016 1 3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2016-01-04 2016 1 4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2016-01-05 2016 1 5 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
In [155]:
climate_2016.columns
Out[155]:
Index(['Date/Time', 'Year', 'Month', 'Day', 'Data Quality', 'Max Temp (°C)', 'Max Temp Flag', 'Min Temp (°C)', 'Min Temp Flag', 'Mean Temp (°C)', 'Mean Temp Flag', 'Heat Deg Days (°C)', 'Heat Deg Days Flag', 'Cool Deg Days (°C)', 'Cool Deg Days Flag', 'Total Rain (mm)', 'Total Rain Flag', 'Total Snow (cm)', 'Total Snow Flag', 'Total Precip (mm)', 'Total Precip Flag', 'Snow on Grnd (cm)', 'Snow on Grnd Flag'], dtype='object')
In [156]:
# Dropping the '... Flag' columns (each holds a single distinct non-null value) and 'Data Quality'

drop_cols = []

for col in climate_2016.columns:
    if 'Flag' in col:
        print(climate_2016[col].value_counts())
        print()
        drop_cols.append(col)

drop_cols.append('Data Quality')

climate_2016.drop(drop_cols, axis=1, inplace=True)

print(climate_2016.columns)
M    1
Name: Max Temp Flag, dtype: int64

M    8
Name: Min Temp Flag, dtype: int64

M    8
Name: Mean Temp Flag, dtype: int64

M    8
Name: Heat Deg Days Flag, dtype: int64

M    8
Name: Cool Deg Days Flag, dtype: int64

T    3
Name: Total Rain Flag, dtype: int64

T    7
Name: Total Snow Flag, dtype: int64

T    7
Name: Total Precip Flag, dtype: int64

T    6
Name: Snow on Grnd Flag, dtype: int64

Index(['Date/Time', 'Year', 'Month', 'Day', 'Max Temp (°C)', 'Min Temp (°C)', 'Mean Temp (°C)', 'Heat Deg Days (°C)', 'Cool Deg Days (°C)', 'Total Rain (mm)', 'Total Snow (cm)', 'Total Precip (mm)', 'Snow on Grnd (cm)'], dtype='object')
In [157]:
# Formatting column names

climate_2016.columns = [col.replace(' (°C)','').strip() for col in climate_2016.columns]

climate_2016.columns
Out[157]:
Index(['Date/Time', 'Year', 'Month', 'Day', 'Max Temp', 'Min Temp', 'Mean Temp', 'Heat Deg Days', 'Cool Deg Days', 'Total Rain (mm)', 'Total Snow (cm)', 'Total Precip (mm)', 'Snow on Grnd (cm)'], dtype='object')
In [158]:
climate_2016.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 366 entries, 0 to 365
Data columns (total 13 columns):
Date/Time            366 non-null object
Year                 366 non-null int64
Month                366 non-null int64
Day                  366 non-null int64
Max Temp             226 non-null float64
Min Temp             219 non-null float64
Mean Temp            219 non-null float64
Heat Deg Days        219 non-null float64
Cool Deg Days        219 non-null float64
Total Rain (mm)      227 non-null float64
Total Snow (cm)      227 non-null float64
Total Precip (mm)    227 non-null float64
Snow on Grnd (cm)    227 non-null float64
dtypes: float64(9), int64(3), object(1)
memory usage: 37.2+ KB
In [159]:
# Dropping rows where any of the measurement columns is null (dropna defaults to how='any')

cols = ['Max Temp',
 'Min Temp',
 'Mean Temp',
 'Heat Deg Days',
 'Cool Deg Days',
 'Total Rain (mm)',
 'Total Snow (cm)',
 'Total Precip (mm)',
 'Snow on Grnd (cm)']

climate_2016.dropna(subset=cols, inplace=True)

climate_2016.reset_index(drop=True, inplace=True)

climate_2016.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 219 entries, 0 to 218
Data columns (total 13 columns):
Date/Time            219 non-null object
Year                 219 non-null int64
Month                219 non-null int64
Day                  219 non-null int64
Max Temp             219 non-null float64
Min Temp             219 non-null float64
Mean Temp            219 non-null float64
Heat Deg Days        219 non-null float64
Cool Deg Days        219 non-null float64
Total Rain (mm)      219 non-null float64
Total Snow (cm)      219 non-null float64
Total Precip (mm)    219 non-null float64
Snow on Grnd (cm)    219 non-null float64
dtypes: float64(9), int64(3), object(1)
memory usage: 22.3+ KB
In [160]:
# Extracting week # from the date field and formatting it

climate_2016['Week'] = climate_2016['Date/Time'].apply(lambda x: datetime.datetime.strptime(x, "%Y-%m-%d").strftime("%V"))

climate_2016['Week'] = climate_2016['Week'].apply(lambda x: str(x).lstrip('0'))

climate_2016.head()
Out[160]:
Date/Time Year Month Day Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Week
0 2016-01-08 2016 1 8 7.0 -14.0 -3.5 21.5 0.0 9.0 0.0 9.0 0.0 1
1 2016-01-09 2016 1 9 11.0 3.0 7.0 11.0 0.0 14.6 0.0 14.6 0.0 1
2 2016-01-10 2016 1 10 -1.5 -5.0 -3.3 21.3 0.0 0.0 7.0 7.0 0.0 1
3 2016-01-12 2016 1 12 -5.0 -12.0 -8.5 26.5 0.0 0.0 0.0 0.0 12.0 2
4 2016-01-14 2016 1 14 2.5 -10.5 -4.0 22.0 0.0 0.0 0.0 0.0 11.0 2
In [161]:
# Extracting Jan to June month's data

climate_2016 = climate_2016[climate_2016['Month'] <= 6]
climate_2016.reset_index(drop=True, inplace=True)

climate_2016.head()
Out[161]:
Date/Time Year Month Day Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Week
0 2016-01-08 2016 1 8 7.0 -14.0 -3.5 21.5 0.0 9.0 0.0 9.0 0.0 1
1 2016-01-09 2016 1 9 11.0 3.0 7.0 11.0 0.0 14.6 0.0 14.6 0.0 1
2 2016-01-10 2016 1 10 -1.5 -5.0 -3.3 21.3 0.0 0.0 7.0 7.0 0.0 1
3 2016-01-12 2016 1 12 -5.0 -12.0 -8.5 26.5 0.0 0.0 0.0 0.0 12.0 2
4 2016-01-14 2016 1 14 2.5 -10.5 -4.0 22.0 0.0 0.0 0.0 0.0 11.0 2
In [162]:
climate_2016.tail()
Out[162]:
Date/Time Year Month Day Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Week
88 2016-06-24 2016 6 24 29.0 13.0 21.0 0.0 3.0 0.0 0.0 0.0 0.0 25
89 2016-06-25 2016 6 25 30.5 13.0 21.8 0.0 3.8 0.0 0.0 0.0 0.0 25
90 2016-06-26 2016 6 26 34.0 18.5 26.3 0.0 8.3 3.2 0.0 3.2 0.0 25
91 2016-06-27 2016 6 27 34.5 20.0 27.3 0.0 9.3 0.0 0.0 0.0 0.0 26
92 2016-06-30 2016 6 30 30.0 13.0 21.5 0.0 3.5 2.0 0.0 2.0 0.0 26
In [163]:
climate_2016.sort_values(by=['Week'], ascending = True, inplace = True)
In [164]:
# Getting weekly data

climate_2016 = climate_2016.groupby(['Week']).agg({'Max Temp':"max", 'Min Temp':"min", 'Mean Temp':"mean", 'Heat Deg Days':"mean", 'Cool Deg Days':"mean", 'Total Rain (mm)':"mean", 'Total Snow (cm)':"mean", 'Total Precip (mm)':"mean", 'Snow on Grnd (cm)':"mean"}).reset_index(drop=False)

climate_2016.head()
Out[164]:
Week Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm)
0 1 11.0 -14.0 0.066667 17.933333 0.0 7.866667 2.333333 10.200000 0.0
1 10 20.0 -2.0 10.166667 7.833333 0.0 9.000000 0.000000 9.000000 0.0
2 11 16.0 -2.5 8.900000 9.100000 0.0 0.500000 0.000000 0.500000 0.0
3 12 20.5 -3.5 5.720000 12.280000 0.0 12.800000 0.000000 12.800000 0.0
4 13 16.0 -5.0 6.871429 11.128571 0.0 2.828571 0.542857 3.371429 0.0
In [165]:
climate_2016.duplicated().any()
Out[165]:
False

No duplicates

In [166]:
climate_2016.isnull().values.any()
Out[166]:
False

No Nulls

List Difference

In [167]:
def list_diff(list1, list2): 
    return (list(set(list1) - set(list2)))

Concatenating Climate Datasets

In [168]:
print(list_diff(list(climate_2015.columns), list(climate_2016.columns)))
[]
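
Note that `list_diff` is one-directional: an empty result only shows that no 2015 column is missing from 2016. A sketch of a symmetric check (the helper `list_sym_diff` and the toy column lists are hypothetical):

```python
def list_diff(list1, list2):
    return list(set(list1) - set(list2))

def list_sym_diff(list1, list2):
    """Items present in exactly one of the two lists."""
    return sorted(set(list1) ^ set(list2))

cols_a = ["Week", "Max Temp", "Min Temp"]
cols_b = ["Week", "Max Temp", "Min Temp", "Mean Temp"]

print(list_diff(cols_a, cols_b))      # nothing in cols_a is missing from cols_b
print(list_sym_diff(cols_a, cols_b))  # but cols_b carries an extra column
```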
In [169]:
print(climate_2015.shape)
print(climate_2016.shape)
(26, 10)
(25, 10)

Concatenating the climate data sets to form a single one to be merged with the order data set

In [170]:
climate = pd.concat([climate_2015, climate_2016], axis = 0)

climate.shape
Out[170]:
(51, 10)
In [171]:
climate.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 51 entries, 1 to 24
Data columns (total 10 columns):
Week                 51 non-null object
Max Temp             51 non-null float64
Min Temp             51 non-null float64
Mean Temp            51 non-null float64
Heat Deg Days        51 non-null float64
Cool Deg Days        51 non-null float64
Total Rain (mm)      51 non-null float64
Total Snow (cm)      51 non-null float64
Total Precip (mm)    51 non-null float64
Snow on Grnd (cm)    51 non-null float64
dtypes: float64(9), object(1)
memory usage: 4.4+ KB
In [172]:
climate.head()
Out[172]:
Week Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm)
1 28 28.0 12.5 20.100000 0.283333 2.383333 4.416667 0.0 4.416667 0.0
2 29 33.0 11.0 23.183333 0.000000 5.183333 1.400000 0.0 1.400000 0.0
3 30 31.5 14.5 23.060000 0.000000 5.060000 1.080000 0.0 1.080000 0.0
4 31 33.5 16.0 24.566667 0.000000 6.566667 4.633333 0.0 4.633333 0.0
5 32 28.5 15.0 21.650000 0.000000 3.650000 0.350000 0.0 0.350000 0.0
In [173]:
climate.Week.unique()
Out[173]:
array(['28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38',
       '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49',
       '50', '51', '52', '53', '1', '10', '11', '12', '13', '14', '15',
       '16', '17', '18', '19', '2', '20', '21', '22', '23', '24', '25',
       '26', '4', '5', '6', '7', '8', '9'], dtype=object)
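
The combined frame has 51 weekly rows. One way to see which week labels of the analysis window are absent is to compare against the expected labels. The expected range below (weeks 28 to 53 of 2015, then 1 to 26 of 2016) is an assumption based on the weeks kept above, and `present` is copied from the array just shown:

```python
# Week labels copied from the climate.Week.unique() output above
present = ['28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38',
           '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49',
           '50', '51', '52', '53', '1', '10', '11', '12', '13', '14', '15',
           '16', '17', '18', '19', '2', '20', '21', '22', '23', '24', '25',
           '26', '4', '5', '6', '7', '8', '9']

# Assumed window: weeks 28-53 of 2015 followed by weeks 1-26 of 2016
expected = [str(w) for w in range(28, 54)] + [str(w) for w in range(1, 27)]

missing = sorted(set(expected) - set(present), key=int)
print(missing)
```

For the array shown above this reports week '3' as the only absent label, so a later merge on `Week` would have no climate values for that week.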

Extracting 3 separate dataframes for 3 product subcategories - camera accessory, home audio and gaming accessory

In [174]:
main_df.shape
Out[174]:
(1464190, 21)
In [175]:
main_df['product_analytic_sub_category'].value_counts()
Out[175]:
speaker                468310
cameraaccessory        215901
gamingaccessory        185876
tvvideosmall           133192
homeaudio              111061
audiomp3player         103463
game                    96842
camera                  87552
gamingconsole           26984
camerastorage           16558
audioaccessory          10702
hometheatre              4197
amplifierreceiver        3455
gamemembershipcards        97
Name: product_analytic_sub_category, dtype: int64
In [176]:
# .copy() makes each subset an independent dataframe, so the columns added below do not touch slices of main_df
cameraaccessory_df = main_df.loc[main_df['product_analytic_sub_category']=='cameraaccessory'].copy()
gamingaccessory_df = main_df.loc[main_df['product_analytic_sub_category']=='gamingaccessory'].copy()
homeaudio_df = main_df.loc[main_df['product_analytic_sub_category']=='homeaudio'].copy()

print('No of rows in cameraaccessory_df: {}'.format(cameraaccessory_df.shape[0]))
print('No of rows in gamingaccessory_df: {}'.format(gamingaccessory_df.shape[0]))
print('No of rows in homeaudio_df: {}'.format(homeaudio_df.shape[0]))
No of rows in cameraaccessory_df: 215901
No of rows in gamingaccessory_df: 185876
No of rows in homeaudio_df: 111061

Converting some binary variables to numeric format

In [177]:
cameraaccessory_df['is_cod'] = cameraaccessory_df['s1_fact.order_payment_type'].apply(lambda x:1 if x=='cod' else 0)
gamingaccessory_df['is_cod'] = gamingaccessory_df['s1_fact.order_payment_type'].apply(lambda x:1 if x=='cod' else 0)
homeaudio_df['is_cod'] = homeaudio_df['s1_fact.order_payment_type'].apply(lambda x:1 if x=='cod' else 0)


cameraaccessory_df['is_mass_market'] = cameraaccessory_df['product_type'].apply(lambda x:1 if x=='mass_market' else 0)
gamingaccessory_df['is_mass_market'] = gamingaccessory_df['product_type'].apply(lambda x:1 if x=='mass_market' else 0)
homeaudio_df['is_mass_market'] = homeaudio_df['product_type'].apply(lambda x:1 if x=='mass_market' else 0)

cameraaccessory_df.head()
Out[177]:
order_date Year Month Week gmv list_price Discount% units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type is_cod is_mass_market
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 10.99 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury 1 0
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 4.03 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury 1 0
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 5.19 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market 1 1
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 19.49 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market 0 1
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 22.92 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market 0 1
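
The 0/1 flags can also be built with vectorised comparisons instead of `apply`. A minimal sketch on a toy frame (the rows are made up; the column names follow the notebook):

```python
import pandas as pd

orders = pd.DataFrame({
    "product_analytic_sub_category": ["cameraaccessory", "homeaudio", "cameraaccessory"],
    "s1_fact.order_payment_type": ["cod", "prepaid", "prepaid"],
    "product_type": ["luxury", "mass_market", "mass_market"],
})

# Work on an independent copy of the subset so new columns are added unambiguously
camera = orders.loc[orders["product_analytic_sub_category"] == "cameraaccessory"].copy()

# Boolean comparison cast to int gives the same 1/0 encoding as the lambda version
camera["is_cod"] = camera["s1_fact.order_payment_type"].eq("cod").astype(int)
camera["is_mass_market"] = camera["product_type"].eq("mass_market").astype(int)
print(camera[["is_cod", "is_mass_market"]])
```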

Deleting the columns 's1_fact.order_payment_type' & 'product_type'

In [178]:
# Dropping the source columns, which are now encoded by is_cod and is_mass_market

drop_columns = ['s1_fact.order_payment_type','product_type']

cameraaccessory_df.drop(drop_columns, axis=1, inplace=True)
gamingaccessory_df.drop(drop_columns, axis=1, inplace=True)
homeaudio_df.drop(drop_columns, axis=1, inplace=True)

For the categorical variable with multiple levels (product_analytic_vertical), creating one-hot encoded dummy features

In [179]:
# Creating dummy variables for the remaining categorical variable
dummy1 = pd.get_dummies(cameraaccessory_df[['product_analytic_vertical']], prefix='product_vertical', drop_first=True)
dummy2 = pd.get_dummies(gamingaccessory_df[['product_analytic_vertical']], prefix='product_vertical', drop_first=True)
dummy3 = pd.get_dummies(homeaudio_df[['product_analytic_vertical']], prefix='product_vertical', drop_first=True)

# Adding the results to the original dataframes
cameraaccessory_df = pd.concat([cameraaccessory_df, dummy1], axis=1)
gamingaccessory_df = pd.concat([gamingaccessory_df, dummy2], axis=1)
homeaudio_df = pd.concat([homeaudio_df, dummy3], axis=1)
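
To make the effect of drop_first=True concrete, here is a minimal sketch on a made-up frame. The vertical names come from the notebook, but toy_orders and its rows are hypothetical. With drop_first=True the first level in sorted order gets no column of its own and becomes the implicit baseline.

```python
import pandas as pd

# Hypothetical order-level rows; only the vertical names are taken from the notebook
toy_orders = pd.DataFrame({
    'gmv': [100.0, 250.0, 80.0, 120.0],
    'product_analytic_vertical': ['cameratripod', 'lens', 'filter', 'lens'],
})

# One indicator column per level; drop_first=True omits the first level
# in sorted order ('cameratripod' here), which becomes the baseline
toy_dummies = pd.get_dummies(toy_orders[['product_analytic_vertical']],
                             prefix='product_vertical', drop_first=True)

# Attach the indicator columns to the original rows
toy_encoded = pd.concat([toy_orders, toy_dummies], axis=1)
print(toy_encoded)
```

A row whose indicator columns are all 0 therefore belongs to the dropped baseline level.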

Dropping the original 'product_analytic_vertical' column, which is now redundant

In [180]:
# removing columns
cameraaccessory_df = cameraaccessory_df.drop('product_analytic_vertical', axis=1)
gamingaccessory_df = gamingaccessory_df.drop('product_analytic_vertical', axis=1)
homeaudio_df = homeaudio_df.drop('product_analytic_vertical', axis=1)

print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (215901, 43)
Shape of gamingaccessory_df: (185876, 34)
Shape of homeaudio_df: (111061, 30)

Checking the number of unique values in each column of the 3 new dataframes

In [181]:
# Unique value frequencies
unique_values = pd.DataFrame(cameraaccessory_df.apply(lambda x: len(x.value_counts(dropna=False)), axis=0), \
                             columns=['Unique Value Count']).sort_values(by='Unique Value Count', ascending=True)
unique_values['dtype'] = pd.DataFrame(cameraaccessory_df.dtypes)
unique_values.head()
Out[181]:
Unique Value Count dtype
product_analytic_sub_category 1 object
product_analytic_category 1 object
product_vertical_camerabag 2 uint8
product_vertical_camerabatterygrip 2 uint8
product_vertical_cameraeyecup 2 uint8
In [182]:
# Unique value frequencies
unique_values = pd.DataFrame(gamingaccessory_df.apply(lambda x: len(x.value_counts(dropna=False)), axis=0), \
                             columns=['Unique Value Count']).sort_values(by='Unique Value Count', ascending=True)
unique_values['dtype'] = pd.DataFrame(gamingaccessory_df.dtypes)
unique_values.head()
Out[182]:
Unique Value Count dtype
product_analytic_sub_category 1 object
product_analytic_category 1 object
payday_flag 2 int64
product_vertical_gamecontrolmount 2 uint8
product_vertical_gamepad 2 uint8
In [183]:
# Unique value frequencies
unique_values = pd.DataFrame(homeaudio_df.apply(lambda x: len(x.value_counts(dropna=False)), axis=0), \
                             columns=['Unique Value Count']).sort_values(by='Unique Value Count', ascending=True)
unique_values['dtype'] = pd.DataFrame(homeaudio_df.dtypes)
unique_values.head()
Out[183]:
Unique Value Count dtype
product_analytic_sub_category 1 object
product_analytic_category 1 object
product_vertical_voicerecorder 2 uint8
product_vertical_dock 2 uint8
product_vertical_djcontroller 2 uint8

Thus we see that in all 3 dataframes, the columns 'product_analytic_category' and 'product_analytic_sub_category' have only 1 unique value, so they carry no information for the model. Hence we will drop these 2 columns from the 3 dataframes.

In [184]:
# Dropping the constant columns, which add nothing to the analysis

drop_columns = ['product_analytic_category', 'product_analytic_sub_category']

cameraaccessory_df.drop(drop_columns, axis=1, inplace=True)
gamingaccessory_df.drop(drop_columns, axis=1, inplace=True)
homeaudio_df.drop(drop_columns, axis=1, inplace=True)
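
As a possible alternative to reading the constant columns off the table by eye, the sketch below finds them programmatically. It is an illustration on a made-up frame (toy_df is hypothetical) and uses DataFrame.nunique in place of the value_counts approach above.

```python
import pandas as pd

# Hypothetical frame with two constant columns and one varying column
toy_df = pd.DataFrame({
    'product_analytic_category': ['cameraaccessory'] * 3,
    'product_analytic_sub_category': ['cameraaccessory'] * 3,
    'gmv': [6400.0, 6900.0, 1990.0],
})

# dropna=False counts NaN as a value of its own, like value_counts(dropna=False)
unique_counts = toy_df.nunique(dropna=False)

# Columns with a single distinct value cannot explain any variation in gmv
constant_columns = unique_counts[unique_counts == 1].index.tolist()

toy_reduced = toy_df.drop(columns=constant_columns)
print(constant_columns)
```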

Checking if any null values exist in the 3 new dataframes

In [185]:
print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: False
Are there any null values in gamingaccessory_df: False
Are there any null values in homeaudio_df: False

Roll Up Order Data to Weekly Level

Rolling up order data to weekly level: gmv, the flag columns and the product-vertical dummies are summed per week, while Discount%, the delivery-day columns and the SLA columns are averaged per week.

Units is left out of the roll-up. It is not a very useful feature from a business perspective, since it is obvious that the more units sold, the higher the revenue. The challenge of prediction is to find the attributes that can be tuned so that customers buy more units, thereby increasing revenue.

In [186]:
cameraaccessory_df = cameraaccessory_df.groupby(['Week']).agg({'gmv':"sum",'Discount%':'mean','deliverybdays': "mean", \
                                          'deliverycdays':'mean','sla':'mean', 'product_procurement_sla':'mean', \
                                          'payday_flag':'sum','occassion_flag':'sum','is_cod':'sum', 'is_mass_market':'sum', \
                                          'product_vertical_cameraaccessory':'sum', \
                                          'product_vertical_camerabag':'sum', 'product_vertical_camerabattery':'sum', \
                                          'product_vertical_camerabatterycharger':'sum', 'product_vertical_camerabatterygrip': \
                                          'sum','product_vertical_cameraeyecup':'sum','product_vertical_camerafilmrolls':'sum', \
                                          'product_vertical_camerahousing':'sum','product_vertical_cameraledlight':'sum', \
                                          'product_vertical_cameramicrophone':'sum','product_vertical_cameramount':'sum', \
                                          'product_vertical_cameraremotecontrol':'sum', 'product_vertical_cameratripod':'sum', \
                                          'product_vertical_extensiontube':'sum', 'product_vertical_filter':'sum', \
                                          'product_vertical_flash':'sum','product_vertical_flashshoeadapter':'sum', \
                                          'product_vertical_lens':'sum','product_vertical_reflectorumbrella':'sum', \
                                          'product_vertical_softbox':'sum','product_vertical_strap':'sum', \
                                          'product_vertical_teleconverter':'sum','product_vertical_telescope':'sum'}) \
.reset_index(drop=False)
cameraaccessory_df.shape
Out[186]:
(51, 34)
In [187]:
gamingaccessory_df = gamingaccessory_df.groupby(['Week']).agg({'gmv':"sum",'Discount%':'mean','deliverybdays':"mean", \
                                                               'deliverycdays':'mean','sla':'mean','product_procurement_sla': \
                                                               'mean','payday_flag':'sum','occassion_flag':'sum','is_cod':'sum', \
                                                               'is_mass_market':'sum','product_vertical_gamecontrolmount':'sum', \
                                                               'product_vertical_gamepad':'sum', \
                                                               'product_vertical_gamingaccessorykit':'sum', \
                                                               'product_vertical_gamingadapter':'sum', \
                                                               'product_vertical_gamingchargingstation':'sum', \
                                                               'product_vertical_gamingheadset':'sum', \
                                                               'product_vertical_gamingkeyboard':'sum', \
                                                               'product_vertical_gamingmemorycard':'sum', \
                                                               'product_vertical_gamingmouse':'sum', \
                                                               'product_vertical_gamingmousepad':'sum', \
                                                               'product_vertical_gamingspeaker':'sum', 
                                                               'product_vertical_joystickgamingwheel':'sum', \
                                                               'product_vertical_motioncontroller':'sum', \
                                                               'product_vertical_tvoutcableaccessory':'sum'}) \
.reset_index(drop=False)
gamingaccessory_df.shape
Out[187]:
(52, 25)
In [188]:
homeaudio_df = homeaudio_df.groupby(['Week']).agg({'gmv':"sum",'Discount%':'mean','deliverybdays': \
                                                   'mean','deliverycdays':'mean','sla':'mean','product_procurement_sla':'mean',\
                                                   'payday_flag':'sum','occassion_flag':'sum','is_cod':'sum', \
                                                   'is_mass_market':'sum','product_vertical_djcontroller':'sum', \
                                                   'product_vertical_dock':'sum', 'product_vertical_dockingstation':'sum', \
                                                   'product_vertical_fmradio':'sum', 'product_vertical_hifisystem':'sum', \
                                                   'product_vertical_homeaudiospeaker':'sum', \
                                                   'product_vertical_karaokeplayer':'sum', 'product_vertical_slingbox':'sum', \
                                                   'product_vertical_soundmixer':'sum','product_vertical_voicerecorder':'sum'})\
.reset_index(drop=False)
homeaudio_df.shape
Out[188]:
(49, 21)
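
The three roll-up cells above list every dummy column by hand. A minimal sketch of one way to build the same kind of aggregation spec from the column names is shown below. The frame toy_orders and the helper lists mean_columns and sum_columns are hypothetical, and the sketch assumes all dummy columns share the 'product_vertical_' prefix, as they do in this notebook.

```python
import pandas as pd

# Hypothetical order-level data with a small subset of the notebook's columns
toy_orders = pd.DataFrame({
    'Week': ['01', '01', '02'],
    'gmv': [100.0, 200.0, 50.0],
    'Discount%': [10.0, 20.0, 30.0],
    'is_cod': [1, 0, 1],
    'product_vertical_lens': [1, 0, 0],
    'product_vertical_filter': [0, 1, 1],
})

mean_columns = ['Discount%']       # averaged per week
sum_columns = ['gmv', 'is_cod']    # summed per week
dummy_columns = [c for c in toy_orders.columns if c.startswith('product_vertical_')]

# Build the {column: aggregation} mapping instead of typing each entry
agg_spec = {c: 'mean' for c in mean_columns}
agg_spec.update({c: 'sum' for c in sum_columns + dummy_columns})

toy_weekly = toy_orders.groupby('Week').agg(agg_spec).reset_index()
print(toy_weekly)
```

Building the mapping from the column names avoids accidental duplicates or omissions when the set of verticals differs between sub-categories.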

Calculating Payday week & Holiday week

  • If a payday falls within the week (payday_flag > 0), then payday_week = 1, else 0
  • If a holiday falls within the week (occassion_flag > 0), then holiday_week = 1, else 0
In [189]:
cameraaccessory_df['payday_week'] = cameraaccessory_df['payday_flag'].apply(lambda x:1 if x > 0 else 0)
gamingaccessory_df['payday_week'] = gamingaccessory_df['payday_flag'].apply(lambda x:1 if x > 0 else 0)
homeaudio_df['payday_week'] = homeaudio_df['payday_flag'].apply(lambda x:1 if x > 0 else 0)

cameraaccessory_df['holiday_week'] = cameraaccessory_df['occassion_flag'].apply(lambda x:1 if x > 0 else 0)
gamingaccessory_df['holiday_week'] = gamingaccessory_df['occassion_flag'].apply(lambda x:1 if x > 0 else 0)
homeaudio_df['holiday_week'] = homeaudio_df['occassion_flag'].apply(lambda x:1 if x > 0 else 0)

cameraaccessory_df.head()
Out[189]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla payday_flag occassion_flag is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week
0 01 5077831.0 49.228778 0.000000 0.000000 5.894657 2.881394 0 0 4272 4951 48 280.0 580.0 261.0 3 3 31 0 1 0 30 88.0 776.0 2 211.0 2022.0 0 634.0 0 0 26 0 31 0 0
1 02 4372114.0 48.386792 0.002616 0.002878 6.570382 2.663004 1474 0 3000 3515 48 264.0 468.0 230.0 7 2 40 0 0 0 21 76.0 579.0 2 218.0 1097.0 0 509.0 0 0 36 0 27 1 0
2 03 6954520.0 47.812018 0.000780 0.000975 6.384016 2.695712 0 0 3819 4558 44 295.0 607.0 303.0 13 1 46 0 0 2 34 86.0 431.0 3 268.0 1863.0 0 823.0 0 0 40 0 37 0 0
3 04 5056909.0 43.810967 0.000000 0.000000 6.664318 2.648816 1054 0 2795 3149 53 324.0 622.0 225.0 10 0 30 0 1 0 33 65.0 374.0 4 207.0 631.0 0 624.0 0 0 32 0 38 1 0
4 05 5380254.0 51.190178 0.002205 0.002940 6.340805 2.728826 0 0 4323 5100 47 342.0 941.0 251.0 5 1 37 0 0 0 50 77.0 564.0 4 250.0 1901.0 0 591.0 0 0 37 0 43 0 0
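
The same week-level flags can also be derived without apply. The sketch below is an equivalent vectorized form on a made-up frame (toy_weekly is hypothetical): the comparison yields a boolean Series, which is then cast to 0/1.

```python
import pandas as pd

# Hypothetical weekly sums of the order-level flags
toy_weekly = pd.DataFrame({
    'Week': ['01', '02', '03'],
    'payday_flag': [0, 1474, 0],
    'occassion_flag': [0, 0, 12],
})

# A week counts as a payday/holiday week if at least one order carried the flag
toy_weekly['payday_week'] = (toy_weekly['payday_flag'] > 0).astype(int)
toy_weekly['holiday_week'] = (toy_weekly['occassion_flag'] > 0).astype(int)
print(toy_weekly)
```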

Dropping columns 'payday_flag' and 'occassion_flag'

In [190]:
drop_columns = ['payday_flag','occassion_flag']

cameraaccessory_df = cameraaccessory_df.drop(drop_columns, axis=1)
gamingaccessory_df = gamingaccessory_df.drop(drop_columns, axis=1)
homeaudio_df = homeaudio_df.drop(drop_columns, axis=1)

cameraaccessory_df.head()
Out[190]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week
0 01 5077831.0 49.228778 0.000000 0.000000 5.894657 2.881394 4272 4951 48 280.0 580.0 261.0 3 3 31 0 1 0 30 88.0 776.0 2 211.0 2022.0 0 634.0 0 0 26 0 31 0 0
1 02 4372114.0 48.386792 0.002616 0.002878 6.570382 2.663004 3000 3515 48 264.0 468.0 230.0 7 2 40 0 0 0 21 76.0 579.0 2 218.0 1097.0 0 509.0 0 0 36 0 27 1 0
2 03 6954520.0 47.812018 0.000780 0.000975 6.384016 2.695712 3819 4558 44 295.0 607.0 303.0 13 1 46 0 0 2 34 86.0 431.0 3 268.0 1863.0 0 823.0 0 0 40 0 37 0 0
3 04 5056909.0 43.810967 0.000000 0.000000 6.664318 2.648816 2795 3149 53 324.0 622.0 225.0 10 0 30 0 1 0 33 65.0 374.0 4 207.0 631.0 0 624.0 0 0 32 0 38 1 0
4 05 5380254.0 51.190178 0.002205 0.002940 6.340805 2.728826 4323 5100 47 342.0 941.0 251.0 5 1 37 0 0 0 50 77.0 564.0 4 250.0 1901.0 0 591.0 0 0 37 0 43 0 0
In [191]:
print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (51, 34)
Shape of gamingaccessory_df: (52, 25)
Shape of homeaudio_df: (49, 21)

Merge DataFrames

Pre-merge activity

In [192]:
print(cameraaccessory_df.Week.unique())
print("--------------------------------------------------------")
print(gamingaccessory_df.Week.unique())
print("--------------------------------------------------------")
print(homeaudio_df.Week.unique())
print("--------------------------------------------------------")
['01' '02' '03' '04' '05' '06' '07' '08' '09' '10' '11' '12' '13' '14'
 '15' '16' '17' '18' '19' '20' '21' '22' '23' '24' '25' '26' '28' '29'
 '30' '31' '32' '33' '35' '36' '37' '38' '39' '40' '41' '42' '43' '44'
 '45' '46' '47' '48' '49' '50' '51' '52' '53']
--------------------------------------------------------
['01' '02' '03' '04' '05' '06' '07' '08' '09' '10' '11' '12' '13' '14'
 '15' '16' '17' '18' '19' '20' '21' '22' '23' '24' '25' '26' '28' '29'
 '30' '31' '32' '33' '34' '35' '36' '37' '38' '39' '40' '41' '42' '43'
 '44' '45' '46' '47' '48' '49' '50' '51' '52' '53']
--------------------------------------------------------
['01' '02' '03' '04' '05' '06' '07' '08' '09' '10' '11' '12' '13' '14'
 '15' '16' '17' '18' '19' '20' '21' '22' '23' '24' '25' '26' '28' '29'
 '30' '31' '32' '36' '37' '38' '39' '40' '41' '42' '43' '44' '45' '46'
 '47' '48' '49' '50' '51' '52' '53']
--------------------------------------------------------
In [193]:
# Formatting pre merge
cameraaccessory_df.Week = cameraaccessory_df.Week.apply(lambda x: x.lstrip('0'))
gamingaccessory_df.Week = gamingaccessory_df.Week.apply(lambda x: x.lstrip('0'))
homeaudio_df.Week = homeaudio_df.Week.apply(lambda x: x.lstrip('0'))
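
A merge on Week only matches when the key has the same type and format in both frames, which is why the leading zeros are stripped here. The sketch below shows the vectorized string form of that step on a made-up frame, plus a quick way to list the week numbers that are absent. The names toy_weekly and expected_weeks are hypothetical, and the 1 to 53 range is an assumption based on the week labels printed above.

```python
import pandas as pd

# Hypothetical zero-padded week labels, as produced by the weekly roll-up
toy_weekly = pd.DataFrame({'Week': ['01', '02', '10', '53']})

# Vectorized equivalent of apply(lambda x: x.lstrip('0')); '10' stays '10'
toy_weekly['Week'] = toy_weekly['Week'].str.lstrip('0')

# Assumption: week numbers run from 1 to 53
expected_weeks = {str(w) for w in range(1, 54)}
missing_weeks = sorted(expected_weeks - set(toy_weekly['Week']), key=int)
print(missing_weeks[:5])
```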

Merging Media Investment Dataset

In [194]:
# Taking a left join on the Week column

cameraaccessory_df = pd.merge(cameraaccessory_df, media_investment, how='left', on='Week')
gamingaccessory_df = pd.merge(gamingaccessory_df, media_investment, how='left', on='Week')
homeaudio_df = pd.merge(homeaudio_df, media_investment, how='left', on='Week')

print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: False
Are there any null values in gamingaccessory_df: False
Are there any null values in homeaudio_df: False
In [195]:
print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (51, 84)
Shape of gamingaccessory_df: (52, 75)
Shape of homeaudio_df: (49, 71)
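
A left join keeps every weekly row only if the right-hand table has at most one row per Week. The sketch below shows how pd.merge can check this and report unmatched weeks. The frames toy_weekly and toy_media are made up for illustration. The validate and indicator parameters are standard pd.merge options, not something the notebook itself uses.

```python
import pandas as pd

# Hypothetical weekly frame and a media table that lacks week 3
toy_weekly = pd.DataFrame({'Week': ['1', '2', '3'], 'gmv': [10.0, 20.0, 30.0]})
toy_media = pd.DataFrame({'Week': ['1', '2'], 'TV': [1.095, 1.095]})

# validate raises MergeError if toy_media has duplicate Week keys;
# indicator adds a '_merge' column telling where each row came from
merged = pd.merge(toy_weekly, toy_media, how='left', on='Week',
                  validate='many_to_one', indicator=True)

# Weeks present on the left but absent from the right-hand table
unmatched_weeks = merged.loc[merged['_merge'] == 'left_only', 'Week'].tolist()
merged = merged.drop(columns='_merge')
print(unmatched_weeks)
```

Comparing the row count before and after the merge is a cheap additional check that the join did not duplicate any week.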

Merging Net Promoter Score Data Set

In [196]:
# Taking a left join on the Week column

cameraaccessory_df = pd.merge(cameraaccessory_df, net_promoter_score, how='left', on='Week')
gamingaccessory_df = pd.merge(gamingaccessory_df, net_promoter_score, how='left', on='Week')
homeaudio_df = pd.merge(homeaudio_df, net_promoter_score, how='left', on='Week')

print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: False
Are there any null values in gamingaccessory_df: False
Are there any null values in homeaudio_df: False
In [197]:
print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (51, 90)
Shape of gamingaccessory_df: (52, 81)
Shape of homeaudio_df: (49, 77)

Merging Climate Data Set

In [198]:
# Taking a left join on the Week column

cameraaccessory_df = pd.merge(cameraaccessory_df, climate, how='left', on='Week')
gamingaccessory_df = pd.merge(gamingaccessory_df, climate, how='left', on='Week')
homeaudio_df = pd.merge(homeaudio_df, climate, how='left', on='Week')

print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: True
Are there any null values in gamingaccessory_df: True
Are there any null values in homeaudio_df: True

As expected, the climate columns now contain nulls for week 3, which does not exist in the climate dataset

In [199]:
print('No of rows with null values in cameraaccessory_df:{}'.format(cameraaccessory_df[cameraaccessory_df.isnull().any(axis=1)].shape))
print('No of rows with null values in gamingaccessory_df:{}'.format(gamingaccessory_df[gamingaccessory_df.isnull().any(axis=1)].shape))
print('No of rows with null values in homeaudio_df:{}'.format(homeaudio_df[homeaudio_df.isnull().any(axis=1)].shape))
No of rows with null values in cameraaccessory_df:(1, 99)
No of rows with null values in gamingaccessory_df:(1, 90)
No of rows with null values in homeaudio_df:(1, 86)
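
To confirm which week and which columns are affected, rather than only how many rows, a minimal sketch along these lines can help. The frame toy_merged is hypothetical and carries only two of the climate columns.

```python
import numpy as np
import pandas as pd

# Hypothetical merged frame where week 3 has no climate data
toy_merged = pd.DataFrame({
    'Week': ['1', '2', '3'],
    'gmv': [10.0, 20.0, 30.0],
    'Max Temp': [11.0, 4.5, np.nan],
    'Min Temp': [-14.0, -12.0, np.nan],
})

null_mask = toy_merged.isnull()

# Weeks with at least one null, and the columns in which nulls occur
weeks_with_nulls = toy_merged.loc[null_mask.any(axis=1), 'Week'].tolist()
columns_with_nulls = null_mask.columns[null_mask.any(axis=0)].tolist()
n_null_rows = int(null_mask.any(axis=1).sum())
print(weeks_with_nulls, columns_with_nulls, n_null_rows)
```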

Since only 1 row in each dataframe has nulls, we will drop it

In [200]:
cols = ['Max Temp', 'Min Temp', 'Mean Temp', 'Heat Deg Days', 'Cool Deg Days', 'Total Rain (mm)', 'Total Snow (cm)', 'Total Precip (mm)', 'Snow on Grnd (cm)']

cameraaccessory_df.dropna(subset=cols, inplace=True)
gamingaccessory_df.dropna(subset=cols, inplace=True)
homeaudio_df.dropna(subset=cols, inplace=True)
In [201]:
print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: False
Are there any null values in gamingaccessory_df: False
Are there any null values in homeaudio_df: False

No Nulls

In [202]:
print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (50, 99)
Shape of gamingaccessory_df: (51, 90)
Shape of homeaudio_df: (48, 86)

Merging Sale Calendar Data Set

In [203]:
# Taking a left join on the Week column

cameraaccessory_df = pd.merge(cameraaccessory_df, sale_calendar, how='left', on='Week')
gamingaccessory_df = pd.merge(gamingaccessory_df, sale_calendar, how='left', on='Week')
homeaudio_df = pd.merge(homeaudio_df, sale_calendar, how='left', on='Week')

print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: True
Are there any null values in gamingaccessory_df: True
Are there any null values in homeaudio_df: True
In [204]:
# Imputing the nulls with 0, meaning there were no special-sale days in those weeks
# (assigning the result back avoids in-place fillna on a column selection)

cameraaccessory_df['Sale'] = cameraaccessory_df['Sale'].fillna(0)
gamingaccessory_df['Sale'] = gamingaccessory_df['Sale'].fillna(0)
homeaudio_df['Sale'] = homeaudio_df['Sale'].fillna(0)
In [205]:
print('Are there any null values in cameraaccessory_df: {}'.format(cameraaccessory_df.isnull().values.any()))
print('Are there any null values in gamingaccessory_df: {}'.format(gamingaccessory_df.isnull().values.any()))
print('Are there any null values in homeaudio_df: {}'.format(homeaudio_df.isnull().values.any()))
Are there any null values in cameraaccessory_df: False
Are there any null values in gamingaccessory_df: False
Are there any null values in homeaudio_df: False

No Nulls

In [206]:
print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (50, 100)
Shape of gamingaccessory_df: (51, 91)
Shape of homeaudio_df: (48, 87)
In [207]:
cameraaccessory_df.head()
Out[207]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
0 1 5077831.0 49.228778 0.000000 0.000000 5.894657 2.881394 4272 4951 48 280.0 580.0 261.0 3 3 31 0 1 0 30 88.0 776.0 2 211.0 2022.0 0 634.0 0 0 26 0 31 0 0 18.549 23.973667 25.0586 22.026271 55.827691 1.095 1.264333 1.2982 1.177750 3.029435 0.114 0.548667 0.6356 0.627434 1.201981 1.050 9.800667 11.5508 9.191035 20.371640 0.225 0.253000 0.2586 0.235696 0.591280 5.725 5.659000 5.6458 5.234402 13.858825 1.842 1.752 1.7340 1.633464 4.332222 1.050 2.213333 2.446 2.271204 4.994131 0.675 0.225 0.135 0.150000 0.675000 6.773 2.257667 1.3546 1.505111 6.773000 47.093031 46.231010 46.058606 1052.0 1042.666667 1040.8 11.0 -14.0 0.066667 17.933333 0.0 7.866667 2.333333 10.200000 0.000000 0
1 2 4372114.0 48.386792 0.002616 0.002878 6.570382 2.663004 3000 3515 48 264.0 468.0 230.0 7 2 40 0 0 0 21 76.0 579.0 2 218.0 1097.0 0 509.0 0 0 36 0 27 1 0 18.549 21.261333 23.4312 21.253544 52.045615 1.095 1.179667 1.2474 1.159361 2.912661 0.114 0.331333 0.5052 0.513338 0.835188 1.050 5.425333 8.9256 7.381916 13.272984 0.225 0.239000 0.2502 0.233319 0.579768 5.725 5.692000 5.6656 5.343424 14.040295 1.842 1.797 1.7610 1.679805 4.441333 1.050 1.631667 2.097 1.999826 4.046478 0.675 0.450 0.270 0.266667 1.080000 6.773 4.515333 2.7092 2.675753 10.836800 47.093031 46.662021 46.317213 1052.0 1047.333333 1043.6 4.5 -12.0 -3.733333 21.733333 0.0 2.533333 0.000000 2.533333 10.333333 0
2 4 5056909.0 43.810967 0.000000 0.000000 6.664318 2.648816 2795 3149 53 324.0 622.0 225.0 10 0 30 0 1 0 33 65.0 374.0 4 207.0 631.0 0 624.0 0 0 32 0 38 1 0 18.549 18.549000 20.1764 20.185082 48.414821 1.095 1.095000 1.1458 1.133935 2.800558 0.114 0.114000 0.2444 0.355575 0.483068 1.050 1.050000 3.6752 4.880418 6.458274 0.225 0.225000 0.2334 0.230033 0.568717 5.725 5.725000 5.7052 5.494170 14.214506 1.842 1.842 1.8150 1.743882 4.546080 1.050 1.050000 1.399 1.624586 3.136732 0.675 0.675 0.540 0.427984 1.468800 6.773 6.773000 5.4184 4.294419 14.738048 47.093031 47.093031 46.834425 1052.0 1052.000000 1049.2 5.5 -9.0 -1.800000 19.800000 0.0 0.000000 0.000000 0.000000 0.000000 0
3 5 5380254.0 51.190178 0.002205 0.002940 6.340805 2.728826 4323 5100 47 342.0 941.0 251.0 5 1 37 0 0 0 50 77.0 564.0 4 250.0 1901.0 0 591.0 0 0 37 0 43 0 0 9.610 15.569333 16.7612 17.835064 38.658893 0.517 0.902333 0.9794 0.996838 2.197335 0.383 0.203667 0.1678 0.361669 0.672841 2.345 1.481667 1.3090 4.316992 6.219965 0.119 0.189667 0.2038 0.205359 0.460230 3.978 5.142667 5.3756 5.157243 12.506704 1.293 1.659 1.7322 1.643686 4.020648 0.975 1.025000 1.035 1.480233 2.857039 0.000 0.450 0.540 0.332876 0.881280 0.000 4.515333 5.4184 3.340103 8.842829 50.327406 48.171156 47.739906 1222.0 1108.666667 1086.0 13.0 -5.5 3.200000 14.800000 0.0 3.400000 0.000000 3.400000 0.000000 2
4 6 4728483.0 45.519178 0.000000 0.000000 6.309214 2.624675 2973 3501 48 302.0 822.0 213.0 2 1 35 0 0 1 35 88.0 485.0 5 241.0 710.0 0 540.0 0 0 43 0 40 1 1 9.610 12.589667 14.9734 16.007272 32.805336 0.517 0.709667 0.8638 0.890207 1.835401 0.383 0.293333 0.2216 0.366409 0.786704 2.345 1.913333 1.5680 3.878772 6.076979 0.119 0.154333 0.1826 0.186168 0.395138 3.978 4.560333 5.0262 4.895189 11.482022 1.293 1.476 1.6224 1.565756 3.705389 0.975 1.000000 1.020 1.367959 2.689224 0.000 0.225 0.405 0.258904 0.528768 0.000 2.257667 4.0638 2.597858 5.305697 50.327406 49.249281 48.386781 1222.0 1165.333333 1120.0 5.0 -18.0 -7.266667 25.266667 0.0 0.000000 2.666667 2.666667 2.000000 1

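Sorting the dataframes chronologically by week number, from July 2015 to June 2016: weeks numbered 28 and above (the 2015 part of the period) come first, followed by weeks below 28 (the 2016 part)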

In [208]:
# String to Int

cameraaccessory_df['Week'] = cameraaccessory_df['Week'].astype('int64')
gamingaccessory_df['Week'] = gamingaccessory_df['Week'].astype('int64')
homeaudio_df['Week'] = homeaudio_df['Week'].astype('int64')
In [209]:
cameraaccessory_df1 = cameraaccessory_df.loc[cameraaccessory_df['Week'] >= 28]
cameraaccessory_df2 = cameraaccessory_df.loc[cameraaccessory_df['Week'] < 28]

gamingaccessory_df1 = gamingaccessory_df.loc[gamingaccessory_df['Week'] >= 28]
gamingaccessory_df2 = gamingaccessory_df.loc[gamingaccessory_df['Week'] < 28]

homeaudio_df1 = homeaudio_df.loc[homeaudio_df['Week'] >= 28]
homeaudio_df2 = homeaudio_df.loc[homeaudio_df['Week'] < 28]

# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
cameraaccessory_df = pd.concat([cameraaccessory_df1, cameraaccessory_df2])
gamingaccessory_df = pd.concat([gamingaccessory_df1, gamingaccessory_df2])
homeaudio_df = pd.concat([homeaudio_df1, homeaudio_df2])

gamingaccessory_df.head()
Out[209]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_gamecontrolmount product_vertical_gamepad product_vertical_gamingaccessorykit product_vertical_gamingadapter product_vertical_gamingchargingstation product_vertical_gamingheadset product_vertical_gamingkeyboard product_vertical_gamingmemorycard product_vertical_gamingmouse product_vertical_gamingmousepad product_vertical_gamingspeaker product_vertical_joystickgamingwheel product_vertical_motioncontroller product_vertical_tvoutcableaccessory payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 2.688958e+06 40.174910 0.000000 0.000000 5.429812 3.213343 2128 2625 0 930.0 104.0 49 0 206.0 422.0 36.0 955.0 56 0 14 27 79 0 0 4.265 0.000 0.0000 4.265000 4.265000 0.054 0.000000 0.0000 0.054000 0.054000 0.633 0.000000 0.0000 0.633000 0.633000 1.854 0.000 0.0000 1.854000 1.854000 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.332000 0.137 0.000000 0.0000 0.137000 0.137000 1.256 0.000 0.0000 1.256000 1.256000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 0.000000 0.000000 1177.0 0.000000 0.0 28.0 12.5 20.100000 0.283333 2.383333 4.416667 0.0 4.416667 0.0 0
26 29 2.270363e+06 42.688931 0.000000 0.000000 5.326652 2.721232 2059 2458 0 1552.0 131.0 45 0 202.0 173.0 36.0 381.0 47 0 14 16 97 1 0 4.265 0.000 0.0000 4.265000 6.824000 0.054 0.000000 0.0000 0.054000 0.086400 0.633 0.000000 0.0000 0.633000 1.012800 1.854 0.000 0.0000 1.854000 2.966400 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.531200 0.137 0.000000 0.0000 0.137000 0.219200 1.256 0.000 0.0000 1.256000 2.009600 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 0.000000 0.000000 1177.0 0.000000 0.0 33.0 11.0 23.183333 0.000000 5.183333 1.400000 0.0 1.400000 0.0 2
27 30 2.588844e+06 37.110373 0.000000 0.000000 5.420082 2.603279 1873 2169 0 1044.0 165.0 38 0 292.0 188.0 29.0 484.0 68 0 17 21 94 0 0 4.265 4.265 0.0000 4.265000 8.359400 0.054 0.054000 0.0000 0.054000 0.105840 0.633 0.633000 0.0000 0.633000 1.240680 1.854 1.854 0.0000 1.854000 3.633840 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.650720 0.137 0.137000 0.0000 0.137000 0.268520 1.256 1.256 0.0000 1.256000 2.461760 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0 31.5 14.5 23.060000 0.000000 5.060000 1.080000 0.0 1.080000 0.0 0
28 31 1.900937e+06 42.873608 0.008576 0.010292 5.671241 3.188679 1390 1564 0 668.0 101.0 32 0 448.0 84.0 13.0 247.0 37 1 27 14 77 1 0 4.265 4.265 0.0000 4.265000 9.280640 0.054 0.054000 0.0000 0.054000 0.117504 0.633 0.633000 0.0000 0.633000 1.377408 1.854 1.854 0.0000 1.854000 4.034304 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.722432 0.137 0.137000 0.0000 0.137000 0.298112 1.256 1.256 0.0000 1.256000 2.733056 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0 33.5 16.0 24.566667 0.000000 6.566667 4.633333 0.0 4.633333 0.0 0
29 32 2.295000e+03 62.937500 0.000000 0.000000 7.250000 2.250000 3 4 0 2.0 0.0 0 0 1.0 0.0 0.0 0.0 0 0 0 0 1 0 0 1.013 3.181 3.6146 3.542333 6.581384 0.001 0.036333 0.0434 0.042222 0.071502 0.256 0.507333 0.5576 0.549222 1.082445 0.213 1.307 1.5258 1.489333 2.633582 0.0 0.0 0.0 0.0 0.0 0.026 0.230 0.2708 0.264 0.459459 0.015 0.096333 0.1126 0.109889 0.193867 0.503 1.005 1.1054 1.088667 2.142834 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 59.987101 56.395426 55.677091 1206.0 1186.666667 1182.8 28.5 15.0 21.650000 0.000000 3.650000 0.350000 0.0 0.350000 0.0 0
In [210]:
cameraaccessory_df.fillna(value=0, inplace=True)
gamingaccessory_df.fillna(value=0, inplace=True)
homeaudio_df.fillna(value=0, inplace=True)

cameraaccessory_df.head()
Out[210]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 3975505.0 43.971082 0.000000 0.000000 6.959162 2.780803 1909 2549 46 333.0 690.0 223.0 18 0 0 0 0 0 4 141.0 435.0 0 220.0 26.0 0 489.0 0 0 13 0 16 0 0 4.265 0.000 0.0000 4.265000 4.265000 0.054 0.000000 0.0000 0.054000 0.054000 0.633 0.000000 0.0000 0.633000 0.633000 1.854 0.000 0.0000 1.854000 1.854000 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.332000 0.137 0.000000 0.0000 0.137000 0.137000 1.256 0.000 0.0000 1.256000 1.256000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 0.000000 0.000000 1177.0 0.000000 0.0 28.0 12.5 20.100000 0.283333 2.383333 4.416667 0.0 4.416667 0.0 0
26 29 4390316.0 44.334867 0.000000 0.000000 6.395062 2.812865 2033 2729 36 356.0 752.0 225.0 9 0 0 0 0 0 11 142.0 495.0 0 186.0 27.0 0 526.0 0 0 24 0 22 1 0 4.265 0.000 0.0000 4.265000 6.824000 0.054 0.000000 0.0000 0.054000 0.086400 0.633 0.000000 0.0000 0.633000 1.012800 1.854 0.000 0.0000 1.854000 2.966400 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.531200 0.137 0.000000 0.0000 0.137000 0.219200 1.256 0.000 0.0000 1.256000 2.009600 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 0.000000 0.000000 1177.0 0.000000 0.0 33.0 11.0 23.183333 0.000000 5.183333 1.400000 0.0 1.400000 0.0 2
27 30 4368719.0 43.608101 0.002629 0.005258 6.414394 2.865922 2053 2705 48 295.0 773.0 267.0 12 0 0 0 0 0 7 108.0 465.0 0 186.0 31.0 0 573.0 0 0 23 0 30 0 0 4.265 4.265 0.0000 4.265000 8.359400 0.054 0.054000 0.0000 0.054000 0.105840 0.633 0.633000 0.0000 0.633000 1.240680 1.854 1.854 0.0000 1.854000 3.633840 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.650720 0.137 0.137000 0.0000 0.137000 0.268520 1.256 1.256 0.0000 1.256000 2.461760 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0 31.5 14.5 23.060000 0.000000 5.060000 1.080000 0.0 1.080000 0.0 0
28 31 2790458.0 42.999613 0.000000 0.000000 6.370904 2.823237 1406 1790 45 185.0 484.0 180.0 11 0 0 0 0 0 2 84.0 360.0 1 109.0 28.0 0 331.0 0 0 23 0 13 1 0 4.265 4.265 0.0000 4.265000 9.280640 0.054 0.054000 0.0000 0.054000 0.117504 0.633 0.633000 0.0000 0.633000 1.377408 1.854 1.854 0.0000 1.854000 4.034304 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.722432 0.137 0.137000 0.0000 0.137000 0.298112 1.256 1.256 0.0000 1.256000 2.733056 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0 33.5 16.0 24.566667 0.000000 6.566667 4.633333 0.0 4.633333 0.0 0
29 32 2198.0 12.947500 0.000000 0.000000 9.000000 4.000000 1 4 0 0.0 3.0 0.0 0 0 0 0 0 0 0 0.0 1.0 0 0.0 0.0 0 0.0 0 0 0 0 0 0 0 1.013 3.181 3.6146 3.542333 6.581384 0.001 0.036333 0.0434 0.042222 0.071502 0.256 0.507333 0.5576 0.549222 1.082445 0.213 1.307 1.5258 1.489333 2.633582 0.0 0.0 0.0 0.0 0.0 0.026 0.230 0.2708 0.264 0.459459 0.015 0.096333 0.1126 0.109889 0.193867 0.503 1.005 1.1054 1.088667 2.142834 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 59.987101 56.395426 55.677091 1206.0 1186.666667 1182.8 28.5 15.0 21.650000 0.000000 3.650000 0.350000 0.0 0.350000 0.0 0
In [211]:
print('Shape of cameraaccessory_df: {}'.format(cameraaccessory_df.shape))
print('Shape of gamingaccessory_df: {}'.format(gamingaccessory_df.shape))
print('Shape of homeaudio_df: {}'.format(homeaudio_df.shape))
Shape of cameraaccessory_df: (50, 100)
Shape of gamingaccessory_df: (51, 91)
Shape of homeaudio_df: (48, 87)
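
The SMA, EMA and Ad_Stock columns in the tables above can be reproduced from the raw weekly spend. The feature-engineering code is not part of this section, so the sketch below infers the formulas from the displayed 'Total Investment' values for weeks 28-32. The carry-over rate of 0.6 and the recursive (adjust=False) form of the EMA are assumptions that happen to match the printed numbers.

```python
import pandas as pd

# Weekly 'Total Investment' for weeks 28-32, copied from the table above.
spend = pd.Series([4.265, 4.265, 4.265, 4.265, 1.013], index=[28, 29, 30, 31, 32])

# Simple moving averages; incomplete windows are NaN and become 0 after fillna(0).
sma_3 = spend.rolling(window=3).mean().fillna(0)
sma_5 = spend.rolling(window=5).mean().fillna(0)

# Exponential moving average with span 8, recursive form (assumed: adjust=False).
ema_8 = spend.ewm(span=8, adjust=False).mean()

# Geometric ad stock: this week's spend plus a carried-over share of last week's stock.
decay = 0.6  # assumed carry-over rate, inferred from the displayed values
stock_values = []
carry = 0.0
for value in spend:
    carry = value + decay * carry
    stock_values.append(carry)
ad_stock = pd.Series(stock_values, index=spend.index)

summary = pd.DataFrame({'spend': spend, 'SMA_3': sma_3, 'SMA_5': sma_5,
                        'EMA_8': ema_8, 'Ad_Stock': ad_stock})
print(summary.round(6))
```

The last row should line up with the week 32 entries in the table (SMA_3 3.181, SMA_5 3.6146, EMA_8 3.542333, Ad_Stock 6.581384), and the zeros in the early SMA rows are consistent with the fillna(value=0) call above.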

EDA & Visualization

In [212]:
main_df.shape
Out[212]:
(1464190, 21)

Creating a dataframe with required 3 product sub categories from the main dataframe

In [213]:
product_sub = main_df.loc[(main_df['product_analytic_sub_category'] == 'cameraaccessory')| \
                        (main_df['product_analytic_sub_category'] == 'gamingaccessory')| \
                        (main_df['product_analytic_sub_category'] == 'homeaudio')]

product_sub.shape
Out[213]:
(512838, 21)
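
The chained equality filter above can also be written with Series.isin. The sketch below shows the idea on a small stand-in frame; toy_main and its rows are hypothetical and only the column name is taken from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for main_df with a few sub-categories.
toy_main = pd.DataFrame({
    'product_analytic_sub_category': ['cameraaccessory', 'tv', 'homeaudio',
                                      'gamingaccessory', 'speaker'],
    'gmv': [100.0, 250.0, 300.0, 80.0, 40.0],
})

wanted = ['cameraaccessory', 'gamingaccessory', 'homeaudio']

# isin keeps the rows whose sub-category is one of the wanted values;
# .copy() gives an independent frame so new columns can be added safely later.
toy_sub = toy_main.loc[toy_main['product_analytic_sub_category'].isin(wanted)].copy()
print(toy_sub.shape)
```

Taking an explicit copy is a design choice: later cells add columns to the filtered frame, and a copy avoids pandas' chained-assignment warnings.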

Comparing Distribution of Discount% for 2 product types - Luxury and Mass Market

In [214]:
# Set the figure size and styling.
plt.figure(figsize=(3, 3), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

sns.boxplot(x='product_type', y='Discount%', palette='coolwarm', data=product_sub)
# plot legend

# Automatically adjust subplot params so that the subplotS fits in to the figure area.
plt.tight_layout()

# display the plot
plt.show()

The median discount percentage offered on luxury items is lower than that offered on mass-market products.

Luxury brands commonly offer limited or no discounts in order to retain the exclusivity of their products.

Displaying trend of NPS and Stock Index by week

In [215]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(10, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(2, 1, 1)
plt.plot(net_promoter_score.iloc[:,0], net_promoter_score.iloc[:,1], 'g-', linewidth=2)
plt.xlabel('Week', fontsize=10);
plt.ylabel('NPS', fontsize=10);

# subplot 2
plt.subplot(2, 1, 2)
plt.plot(net_promoter_score.iloc[:,0], net_promoter_score.iloc[:,4], 'r-', linewidth=2)
plt.xlabel('Week', fontsize=10);
plt.ylabel('Stock Index', fontsize=10);

# Automatically adjust subplot params so that the subplotS fits in to the figure area.
plt.tight_layout()

# display the plot
plt.show()

Consumer NPS is highest in weeks 32–35, which coincides with the period when the maximum discounts were being offered.

Displaying trend of various Media Channel Investments by week

In [218]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(8, 5), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 1, 1)
plt.plot(media_investment.iloc[:,0], media_investment.iloc[:,1::5], linewidth=3, alpha = 0.7)
plt.xlabel('Week#', fontsize=10)
plt.ylabel('Media Channel Investments', fontsize=10)
# Use the channel column names as legend labels
plt.legend(media_investment.iloc[:,1::5].columns)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

Over the past year, the bulk of the ad investment went into Sponsorship, followed by Online Marketing and Search Engine Marketing (especially during Thanksgiving).
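
The `iloc[:, 1::5]` slice in the plotting cell above picks every fifth column starting at position 1. This works only if each channel occupies a block of five columns (raw spend, SMA_3, SMA_5, EMA_8, Ad_Stock) after the Week column. A minimal sketch of that selection follows; toy_media is hypothetical and its layout is an assumption modelled on the column order shown in the earlier tables.

```python
import pandas as pd

# Assumed layout: 'Week' first, then five columns per channel.
channels = ['Total Investment', 'TV', 'Digital']
suffixes = ['', '_SMA_3', '_SMA_5', '_EMA_8', '_Ad_Stock']
columns = ['Week'] + [channel + suffix for channel in channels for suffix in suffixes]

# Two hypothetical weeks filled with zeros; only the column positions matter here.
toy_media = pd.DataFrame([[28] + [0.0] * 15, [29] + [0.0] * 15], columns=columns)

# Position 1, 6, 11, ... are the raw (untransformed) spend columns.
raw_spend = toy_media.iloc[:, 1::5]
print(list(raw_spend.columns))
```

If the column order ever changes, selecting the raw channels by name is a safer choice than relying on positions.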

Average Revenue from Holiday/Non-holiday days for the 3 product subcategories

In [219]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(6,4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

sns.barplot(y='gmv', x='product_analytic_sub_category', hue ='occassion_flag', \
            palette='husl', data=product_sub, estimator=np.median)

# plot legend
plt.legend(frameon=True, fontsize='small', shadow=True, title='Is a Holiday? 0-No 1-Yes', bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

The average (median) revenue from holiday and non-holiday days is more or less comparable across the 3 product sub-categories.

No of items(Luxury/Mass-market) sold per 3 product subcategories

In [220]:
product_sub.groupby(["product_type", "product_analytic_sub_category"]).size().unstack().plot(kind='bar', \
                                                                                             stacked=True, figsize=(8,6), \
                                                                                             fontsize = 10) 
# plot x axis label
plt.xlabel('Product Type', fontsize = 12)
# plot y axis label
plt.ylabel('No of Items Sold', fontsize = 12)
# plot legend
plt.legend(frameon=True, fontsize='small', shadow=True, title='Product Sub-category', \
           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

From the above graph, we have the following observations:

  • Most of the units sold belonged to the mass-market category.
  • Among the mass-market products, Camera Accessory and Gaming Accessory products sold the most.
  • Home Audio products were the most popular among the luxury products sold.

Total items sold per 3 product subcategories per Month

In [221]:
product_sub.groupby(["Month", "product_analytic_sub_category"]).size().unstack().plot(kind='bar', \
                                                                                             stacked=True, figsize=(8,6), \
                                                                                           fontsize = 10) 
# plot x axis label
plt.xlabel('Month', fontsize = 12)
# plot y axis label
plt.ylabel('No of Items Sold', fontsize = 12)
# plot legend
plt.legend(frameon=True, fontsize='small', shadow=True, title='Product Sub-category', \
           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

Total items sold per 3 product subcategories per Week

In [222]:
product_sub.groupby(["Week", "product_analytic_sub_category"]).size().unstack().plot(kind='bar', \
                                                                                             stacked=True, figsize=(15,6), \
                                                                                           fontsize = 10) 
# plot x axis label
plt.xlabel('Week', fontsize = 12)
# plot y axis label
plt.ylabel('No of Items Sold', fontsize = 12)
# plot legend
plt.legend(frameon=True, fontsize='small', shadow=True, title='Product Sub-category', \
           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

Sales peak in week 42 (Thanksgiving week). Overall, October saw the highest number of items sold.
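
The peak week and month can also be read off numerically rather than from the bar charts. The sketch below uses a handful of hypothetical order rows (toy_orders); in the notebook the same calls would presumably be applied to product_sub.

```python
import pandas as pd

# Hypothetical order-level rows: one row per item sold.
toy_orders = pd.DataFrame({
    'Month': [9, 10, 10, 10, 10, 11],
    'Week':  [38, 41, 42, 42, 42, 45],
})

# Count rows per week and per month, then pick the label with the largest count.
items_per_week = toy_orders.groupby('Week').size()
items_per_month = toy_orders.groupby('Month').size()
peak_week = items_per_week.idxmax()
peak_month = items_per_month.idxmax()
print(peak_week, peak_month)
```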

Top 10 Product Verticals which brought the Maximum Revenue for 3 product sub-categories

In [223]:
highest_gmv = pd.DataFrame(product_sub.groupby(['product_analytic_sub_category','product_analytic_vertical'])['gmv'].sum().sort_values(ascending=False).reset_index()).head(10)
highest_gmv['product_in_category'] = highest_gmv[['product_analytic_vertical','product_analytic_sub_category']].apply(lambda x: ' in '.join(x), axis=1)
highest_gmv.head(20)
Out[223]:
product_analytic_sub_category product_analytic_vertical gmv product_in_category
0 homeaudio homeaudiospeaker 1.873206e+08 homeaudiospeaker in homeaudio
1 cameraaccessory lens 1.085308e+08 lens in cameraaccessory
2 gamingaccessory gamepad 6.187440e+07 gamepad in gamingaccessory
3 gamingaccessory gamingheadset 3.199049e+07 gamingheadset in gamingaccessory
4 cameraaccessory binoculars 2.658427e+07 binoculars in cameraaccessory
5 gamingaccessory gamingmouse 2.632837e+07 gamingmouse in gamingaccessory
6 cameraaccessory camerabattery 2.356174e+07 camerabattery in cameraaccessory
7 cameraaccessory camerabag 2.249499e+07 camerabag in cameraaccessory
8 cameraaccessory flash 2.228150e+07 flash in cameraaccessory
9 homeaudio fmradio 2.222170e+07 fmradio in homeaudio
In [224]:
import squarify    # pip install squarify (algorithm for treemap)
# Create a dataset:
my_values=list(highest_gmv['gmv'])
 
plt.figure(figsize=(12,6), dpi=80, facecolor='w', edgecolor='k')    
# create a color palette, mapped to these values
cmap = matplotlib.cm.Blues
mini=min(my_values)
maxi=max(my_values)
# named color_norm to avoid shadowing scipy.stats.norm imported earlier
color_norm = matplotlib.colors.Normalize(vmin=mini, vmax=maxi)
colors = [cmap(color_norm(value)) for value in my_values]
 
# Change color
squarify.plot(sizes=my_values, alpha=.8, label=highest_gmv['product_in_category'],color=colors)
plt.axis('off')

# Show graphic
plt.tight_layout()
plt.show()

homeaudiospeaker in homeaudio brought in the largest revenue, followed by lens in cameraaccessory and gamepad in gamingaccessory.

Top 10 Product Verticals with most no of sales for 3 product sub-categories

In [225]:
most_sales = pd.DataFrame(product_sub.groupby(['product_analytic_sub_category','product_analytic_vertical'])['units'].count().sort_values(ascending=False).reset_index()).head(10)
most_sales['product_in_category'] = most_sales[['product_analytic_vertical','product_analytic_sub_category']].apply(lambda x: ' in '.join(x), axis=1)
most_sales.head(20)
Out[225]:
product_analytic_sub_category product_analytic_vertical units product_in_category
0 homeaudio homeaudiospeaker 76581 homeaudiospeaker in homeaudio
1 gamingaccessory gamingheadset 59928 gamingheadset in gamingaccessory
2 gamingaccessory gamepad 52437 gamepad in gamingaccessory
3 cameraaccessory flash 47808 flash in cameraaccessory
4 gamingaccessory gamingmouse 35470 gamingmouse in gamingaccessory
5 cameraaccessory camerabattery 35107 camerabattery in cameraaccessory
6 cameraaccessory lens 32350 lens in cameraaccessory
7 cameraaccessory cameratripod 31220 cameratripod in cameraaccessory
8 homeaudio fmradio 24681 fmradio in homeaudio
9 cameraaccessory camerabag 15842 camerabag in cameraaccessory
In [226]:
import squarify    # pip install squarify (algorithm for treemap)
# Create a dataset:
my_values=list(most_sales['units'])
 
plt.figure(figsize=(12,6), dpi=80, facecolor='w', edgecolor='k')    
# create a color palette, mapped to these values
cmap = matplotlib.cm.Blues
mini=min(my_values)
maxi=max(my_values)
# named color_norm to avoid shadowing scipy.stats.norm imported earlier
color_norm = matplotlib.colors.Normalize(vmin=mini, vmax=maxi)
colors = [cmap(color_norm(value)) for value in my_values]
 
# Change color
squarify.plot(sizes=my_values, alpha=.8, label=most_sales['product_in_category'],color=colors)
plt.axis('off')

# Show graphic
plt.tight_layout()
plt.show()

homeaudiospeaker in homeaudio had the highest number of sales, followed by gamingheadset and gamepad in gamingaccessory.
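
Note that the ranking above uses `['units'].count()`, which counts order rows rather than adding up the units in each order. The two measures differ whenever an order contains more than one unit. A small sketch with hypothetical rows (toy_units) illustrates the difference:

```python
import pandas as pd

# Hypothetical orders: the second gamepad order contains 3 units.
toy_units = pd.DataFrame({
    'product_analytic_vertical': ['gamepad', 'gamepad', 'lens'],
    'units': [1, 3, 2],
})

grouped = toy_units.groupby('product_analytic_vertical')['units']
order_lines = grouped.count()  # number of order rows per vertical
units_sold = grouped.sum()     # total units per vertical
print(pd.DataFrame({'order_lines': order_lines, 'units_sold': units_sold}))
```

Which measure is appropriate depends on whether "sales" is meant as the number of orders or the number of units.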

In [227]:
cameraaccessory = cameraaccessory_df.copy()
gamingaccessory = gamingaccessory_df.copy()
homeaudio = homeaudio_df.copy()

cameraaccessory['Week'] = cameraaccessory['Week'].apply(str)
gamingaccessory['Week'] = gamingaccessory['Week'].apply(str)
homeaudio['Week'] = homeaudio['Week'].apply(str)
In [228]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(10, 8), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(2, 1, 1)
plt.plot(gamingaccessory.iloc[:,0], gamingaccessory.iloc[:,1], 'b-', linewidth=2, alpha=0.7)
plt.plot(cameraaccessory.iloc[:,0], cameraaccessory.iloc[:,1], 'r-', linewidth=2, alpha=0.7)
plt.plot(homeaudio.iloc[:,0], homeaudio.iloc[:,1], 'g-', linewidth=2, alpha=0.7)
plt.xlabel('Week', fontsize=10);
plt.ylabel('Total GMV', fontsize=10);
plt.legend(['gamingaccessory gmv','cameraaccessory gmv','homeaudio gmv'])

# subplot 2
plt.subplot(2, 1, 2)
plt.plot(gamingaccessory.iloc[:,0], gamingaccessory.iloc[:,2], 'b-', linewidth=2, alpha=0.7)
plt.plot(cameraaccessory.iloc[:,0], cameraaccessory.iloc[:,2], 'r-', linewidth=2, alpha=0.7)
plt.plot(homeaudio.iloc[:,0], homeaudio.iloc[:,2], 'g-', linewidth=2, alpha=0.7)
plt.plot(homeaudio.iloc[:,0], homeaudio.iloc[:,21], 'c-', linewidth=2, alpha=0.7)
plt.xlabel('Week', fontsize=10);
plt.ylabel('Promotion & Ads', fontsize=10);
plt.legend(['gamingaccessory discount','cameraaccessory discount','homeaudio discount','total ad investment'])

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

The following observations were noted from the above time series plots:

  • In week 42 (during Thanksgiving), all the graphs show a steep rise. Revenue increased because of both a higher discount% and increased ad investment.

  • In week 32 (August), revenue was the lowest for all 3 product sub-categories. This coincides with the minimum total investment in ads. Discount% was also at its lowest for all products apart from camera accessories. After this dip in revenue, discount% was increased to drive higher sales, most noticeably for gaming accessories. However, barring home audio products, revenue stayed flat for the next 3 weeks before it started to pick up.

  • In general, the average discount% offered on home audio products is lower than that offered on the other product sub-categories.

Analyzing how Sales Amount and Revenue vary based on Discount%

In [229]:
# Segmenting the Discount% into various bins

def discount_binning(df,cut_points,label_names):
    column_index = df.columns.get_loc('Discount%') + 1
    df.insert(loc=column_index,column='Discount Bins',value=pd.cut(df['Discount%'],cut_points,labels=label_names, include_lowest=True))
    return df

cut_points = [0,10,20,30,40,50,60,70,80,90,100]
label_names = ["Below 10%","Between 10-20%","Between 20-30%","Between 30-40%","Between 40-50%","Between 50-60%", \
              "Between 60-70%","Between 70-80%","Between 80-90%","Between 90-100%"]

product_sub = discount_binning(product_sub,cut_points,label_names)
product_sub.head()
Out[229]:
order_date Year Month Week gmv list_price Discount% Discount Bins units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 10.99 Between 10-20% 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 4.03 Below 10% 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 5.19 Below 10% 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 19.49 Between 10-20% 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 22.92 Between 20-30% 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market
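
How pd.cut treats boundary values is easy to misread, so the sketch below applies the same cut points and labels to a few Discount% values. Four of them (4.03, 10.99, 19.49, 22.92) come from the rows above; 0.0 and 10.0 are added here as illustrative edge cases.

```python
import pandas as pd

cut_points = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
label_names = ["Below 10%", "Between 10-20%", "Between 20-30%", "Between 30-40%",
               "Between 40-50%", "Between 50-60%", "Between 60-70%",
               "Between 70-80%", "Between 80-90%", "Between 90-100%"]

sample = pd.Series([0.0, 4.03, 10.0, 10.99, 19.49, 22.92])

# Bins are right-closed: (0, 10], (10, 20], ... ; include_lowest=True also
# admits exactly 0 into the first bin instead of returning NaN.
binned = pd.cut(sample, cut_points, labels=label_names, include_lowest=True)
print(pd.DataFrame({'Discount%': sample, 'Discount Bins': binned}))
```

A discount of exactly 10% therefore lands in "Below 10%", and a discount of exactly 0% is kept rather than dropped.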
In [230]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(10,4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 2, 1)
sns.barplot(x='gmv', y='Discount Bins', palette='husl', data=product_sub, estimator=np.median)

# subplot 2
plt.subplot(1, 2, 2)
sns.countplot(y='Discount Bins', palette='husl', data=product_sub)

# Automatically adjust subplot params so that the subplotS fits in to the figure area.
plt.tight_layout()

# display the plot
plt.show()

  • Median revenue is highest when the discount% is between 10-20%. Beyond that, median revenue slowly starts to decline.

  • The number of items sold, on the other hand, increases steadily with discount% until it peaks at 50-60%, after which it starts to fall again.

This shows that at higher discounts, although sales volumes are good, median revenue drops, which suggests a loss for the company. An average discount of 10-20% appears to be the most profitable for the company.

Percentage of items sold at different Discount% segments

In [231]:
from collections import Counter

labels, values = zip(*Counter(product_sub["Discount Bins"]).items())
colors = ["#ff9999", "#99d8c9", "#2ca25f", "#8856a7","#43a2ca","#fdbb84","#e34a33","#bdbdbd","#636363","#f7fcb9"]
piechart_df = (pd.DataFrame(list(values),list(labels)))
piechart_df = piechart_df.reset_index()

fig = plt.figure(figsize=[6,6])

plt.pie(piechart_df[0],labels=piechart_df["index"],startangle=180,explode=(0,0,0,0.1,0,0,0,0,0,0),autopct="%1.1f%%", \
        shadow=True, colors=colors)
plt.tight_layout()
plt.title("No of Items sold at Different Discount%", fontsize=15)
plt.show()

The largest share of items is sold when Discount% is between 50-60%.
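
The Counter-based cell above returns the bins in order of first appearance, so the fixed explode tuple highlights whichever bin happens to land in fourth position. A possible order-stable alternative is sketched below on hypothetical data (toy_bins, with only three bins); it counts per category, sorts by the bin order, and derives the exploded slice from the largest share.

```python
import pandas as pd

label_names = ['Below 10%', 'Between 10-20%', 'Between 20-30%']

# Hypothetical binned discounts stored as an ordered categorical, as pd.cut produces.
toy_bins = pd.Series(pd.Categorical(
    ['Between 10-20%', 'Below 10%', 'Between 20-30%', 'Between 10-20%'],
    categories=label_names, ordered=True))

# Count per bin and sort by category order rather than by first appearance.
counts = toy_bins.value_counts().sort_index()
share = counts / counts.sum() * 100

# Pull out the slice with the largest share instead of a hard-coded position.
explode = [0.1 if label == share.idxmax() else 0 for label in share.index]
print(share)
print(explode)
```

The resulting counts, labels and explode list can be passed to plt.pie in place of the Counter output.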

Analyzing how Sales Amount and Revenue vary based on Payment Types

In [232]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(10,4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 2, 1)
sns.barplot(x='gmv', y='product_analytic_sub_category', hue ='s1_fact.order_payment_type', \
            palette='coolwarm', data=product_sub, estimator=np.median)

# subplot 2
plt.subplot(1, 2, 2)
sns.countplot(y='product_analytic_sub_category', hue ='s1_fact.order_payment_type', palette='coolwarm', data=product_sub)

# plot legend
plt.legend(frameon=True, fontsize='small', shadow=True, title='Payment Type', bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()

For Camera Accessory and Gaming Accessory (but not Home Audio), the median revenue from prepaid orders is higher than that from COD orders, even though the number of products sold is far higher for COD orders in all categories.
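
The two panels compare different statistics: the median gmv per order and the number of orders. Both can be put side by side in one table with a grouped aggregation. The sketch below uses hypothetical rows (toy_payments); only the column names are taken from the notebook.

```python
import pandas as pd

# Hypothetical orders: COD has more rows, prepaid has larger order values.
toy_payments = pd.DataFrame({
    's1_fact.order_payment_type': ['cod', 'cod', 'cod', 'prepaid', 'prepaid'],
    'gmv': [500.0, 700.0, 900.0, 1500.0, 2500.0],
})

# One row per payment type with the median order value and the order count.
payment_summary = (toy_payments
                   .groupby('s1_fact.order_payment_type')['gmv']
                   .agg(['median', 'count']))
print(payment_summary)
```

Adding 'product_analytic_sub_category' to the groupby keys would give the per-category breakdown shown in the plots.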

Finding the percentage of Luxury & Mass-market Products from 3 sub-categories

In [233]:
product_sub['luxury'] = product_sub['product_type'].apply(lambda x:1 if x=='luxury' else 0)
product_sub['mass_market'] = product_sub['product_type'].apply(lambda x:1 if x=='mass_market' else 0)
product_sub.head()
Out[233]:
order_date Year Month Week gmv list_price Discount% Discount Bins units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type luxury mass_market
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 10.99 Between 10-20% 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury 1 0
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 4.03 Below 10% 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury 1 0
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 5.19 Below 10% 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market 0 1
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 19.49 Between 10-20% 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market 0 1
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 22.92 Between 20-30% 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market 0 1
In [234]:
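# Select multiple columns with a list (double brackets); tuple-style selection is deprecated in pandas
product_type = pd.DataFrame(product_sub.groupby('product_analytic_sub_category')[['luxury','mass_market']].sum().reset_index())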
product_type
Out[234]:
product_analytic_sub_category luxury mass_market
0 cameraaccessory 18423 197478
1 gamingaccessory 13007 172869
2 homeaudio 36747 74314
In [235]:
# From raw value to percentage
r = [0,1,2]
totals = [i+j for i,j in zip(product_type['luxury'], product_type['mass_market'])]
luxury = [i / j * 100 for i,j in zip(product_type['luxury'], totals)]
mass_market = [i / j * 100 for i,j in zip(product_type['mass_market'], totals)]
names = list(product_type['product_analytic_sub_category'])

# plot
# adjust figure size
plt.figure(figsize=(6,6), dpi=80, facecolor='w', edgecolor='k')

barWidth = 0.85
# Create Luxury Bars
plt.bar(r, luxury, color='#b5ffb9', edgecolor='white', width=barWidth)
# Create mass_market Bars
plt.bar(r, mass_market, bottom=[i for i in luxury], color='#a3acff', edgecolor='white', width=barWidth)
# Custom x axis
plt.xticks(r, names, rotation='vertical')
plt.legend(['luxury','mass_market'],frameon=True, fontsize='small', shadow=True, title='Product Type', bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.title("Percentage of Luxury & Mass-market Product for different Sub-categories")

# Show graphic
plt.tight_layout()
plt.show()

The percentage of luxury products in Home Audio is much higher than in the other sub-categories.
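
The percentages behind the stacked bars can be computed directly in pandas instead of with list comprehensions. The sketch below rebuilds the counts from Out[234] by hand and divides each row by its total; the variable names are chosen here for illustration.

```python
import pandas as pd

# Counts copied from the Out[234] table above.
type_counts = pd.DataFrame({
    'product_analytic_sub_category': ['cameraaccessory', 'gamingaccessory', 'homeaudio'],
    'luxury': [18423, 13007, 36747],
    'mass_market': [197478, 172869, 74314],
}).set_index('product_analytic_sub_category')

# Divide every row by its own total to get percentages that sum to 100 per row.
type_share = type_counts.div(type_counts.sum(axis=1), axis=0) * 100
print(type_share.round(1))
```

With these counts, roughly a third of Home Audio items are luxury products, against well under a tenth for the other two sub-categories.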

Finding the percentage of COD & Prepaid Products from 3 sub-categories

In [236]:
product_sub['prepaid'] = product_sub['s1_fact.order_payment_type'].apply(lambda x:1 if x=='prepaid' else 0)
product_sub['cod'] = product_sub['s1_fact.order_payment_type'].apply(lambda x:1 if x=='cod' else 0)
product_sub.head()
Out[236]:
order_date Year Month Week gmv list_price Discount% Discount Bins units deliverybdays deliverycdays s1_fact.order_payment_type sla pincode product_analytic_category product_analytic_sub_category product_analytic_vertical product_mrp product_procurement_sla payday_flag occassion_flag product_type luxury mass_market prepaid cod
0 2015-10-17 15:11:54 2015 10 42 6400.0 6400.0 10.99 Between 10-20% 1 0.0 0.0 cod 5 -7.79175582905735E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury 1 0 0 1
1 2015-10-19 10:07:22 2015 10 43 6900.0 6900.0 4.03 Below 10% 1 0.0 0.0 cod 7 7.33541149097431E+018 cameraaccessory cameraaccessory cameratripod 7190.0 0 0 0 luxury 1 0 0 1
2 2015-10-20 15:45:56 2015 10 43 1990.0 1990.0 5.19 Below 10% 1 0.0 0.0 cod 10 -7.47768776228657E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market 0 1 0 1
3 2015-10-14 12:05:15 2015 10 42 1690.0 1690.0 19.49 Between 10-20% 1 0.0 0.0 prepaid 4 -5.83593163877661E+018 cameraaccessory cameraaccessory cameratripod 2099.0 3 1 0 mass_market 0 1 1 0
4 2015-10-17 21:25:03 2015 10 42 1618.0 1618.0 22.92 Between 20-30% 1 0.0 0.0 prepaid 6 5.34735360997242E+017 cameraaccessory cameraaccessory cameratripod 2099.0 3 0 0 mass_market 0 1 1 0
In [237]:
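# Select multiple columns with a list (double brackets); tuple-style selection is deprecated in pandas
payment_type = pd.DataFrame(product_sub.groupby('product_analytic_sub_category')[['prepaid','cod']].sum().reset_index())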
payment_type
Out[237]:
product_analytic_sub_category prepaid cod
0 cameraaccessory 65462 150439
1 gamingaccessory 45273 140603
2 homeaudio 27123 83938
In [238]:
# From raw value to percentage
r = [0,1,2]
totals = [i+j for i,j in zip(payment_type['prepaid'], payment_type['cod'])]
prepaid = [i / j * 100 for i,j in zip(payment_type['prepaid'], totals)]
cod = [i / j * 100 for i,j in zip(payment_type['cod'], totals)]
names = list(payment_type['product_analytic_sub_category'])

# plot
# adjust figure size
plt.figure(figsize=(6,6), dpi=80, facecolor='w', edgecolor='k')

barWidth = 0.85
# Create prepaid Bars
plt.bar(r, prepaid, color='#b5ffb9', edgecolor='white', width=barWidth)
# Create cod Bars
plt.bar(r, cod, bottom=[i for i in prepaid], color='#a3acff', edgecolor='white', width=barWidth)
# Custom x axis
plt.xticks(r, names, rotation='vertical')
plt.legend(['prepaid','cod'],frameon=True, fontsize='small', shadow=True, title='Payment Type', bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.title("Percentage of Prepaid & COD Products for different Sub-categories")

# Show graphic
plt.tight_layout()
plt.show()

The percentage of prepaid payments for Camera Accessory was observed to be slightly higher than for the other sub-categories.

Finding the percentage of Luxury and Mass_market Products under different Discount groups

In [239]:
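# Select multiple columns with a list (double brackets); tuple-style selection is deprecated in pandas
product_type_with_discount = pd.DataFrame(product_sub.groupby('Discount Bins')[['luxury','mass_market']].sum().reset_index())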
product_type_with_discount
Out[239]:
Discount Bins luxury mass_market
0 Below 10% 9110 34093
1 Between 10-20% 14604 31972
2 Between 20-30% 10077 48907
3 Between 30-40% 8593 56255
4 Between 40-50% 7477 57617
5 Between 50-60% 13335 73808
6 Between 60-70% 4847 51506
7 Between 70-80% 68 41789
8 Between 80-90% 48 39781
9 Between 90-100% 18 8933
In [240]:
# From raw value to percentage
r = [0,1,2,3,4,5,6,7,8,9]
totals = [i+j for i,j in zip(product_type_with_discount['luxury'], product_type_with_discount['mass_market'])]
luxury = [i / j * 100 for i,j in zip(product_type_with_discount['luxury'], totals)]
mass_market = [i / j * 100 for i,j in zip(product_type_with_discount['mass_market'], totals)]
names = list(product_type_with_discount['Discount Bins'])

# plot
# adjust figure size
plt.figure(figsize=(10,6), dpi=80, facecolor='w', edgecolor='k')

barWidth = 0.85
# Create Luxury Bars
plt.bar(r, luxury, color='#b5ffb9', edgecolor='white', width=barWidth)
# Create mass_market Bars
plt.bar(r, mass_market, bottom=[i for i in luxury], color='#a3acff', edgecolor='white', width=barWidth)
# Custom x axis
plt.xticks(r, names, rotation='vertical')
plt.legend(['luxury','mass_market'],frameon=True, fontsize='small', shadow=True, title='Product Type', bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.title("Percentage of Luxury & Mass-market Product under different Discount Group")

# Show graphic
plt.tight_layout()
plt.show()

The share of luxury products is highest in the 10-20% discount group (about 31% of the products in that group) and falls below 1% for discounts above 70%.
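The observation above can be checked directly from the counts in Out[239]. The sketch below rebuilds that table by hand (the numbers are copied from the output, the variable names `counts`, `shares` and `top_bin` are illustrative) and uses a vectorised row-wise division instead of list comprehensions.

```python
import pandas as pd

# Counts copied from the Out[239] table above.
counts = pd.DataFrame({
    'Discount Bins': ['Below 10%', 'Between 10-20%', 'Between 20-30%', 'Between 30-40%',
                      'Between 40-50%', 'Between 50-60%', 'Between 60-70%', 'Between 70-80%',
                      'Between 80-90%', 'Between 90-100%'],
    'luxury': [9110, 14604, 10077, 8593, 7477, 13335, 4847, 68, 48, 18],
    'mass_market': [34093, 31972, 48907, 56255, 57617, 73808, 51506, 41789, 39781, 8933],
}).set_index('Discount Bins')

# Divide each row by its row total to get percentage shares.
shares = counts.div(counts.sum(axis=1), axis=0) * 100

# Discount bin in which luxury products make up the largest share.
top_bin = shares['luxury'].idxmax()
print(shares.round(1))
print('Bin with the highest luxury share:', top_bin)
```

The same `div(..., axis=0)` pattern would work for the prepaid/COD chart, since both charts normalise each bar to 100%.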

Relationship between Revenue and Advertisement Spends

In [241]:
# Slightly alter the figure size to make it more horizontal.
#plt.figure(figsize=(10,4), dpi=100, facecolor='w', edgecolor='k', frameon='True')
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster


# 'height' replaces the 'size' argument, which was renamed in seaborn 0.9
sns.pairplot(cameraaccessory, x_vars=['Total Investment', 'TV', 'Digital', 'Sponsorship', 'Content Marketing',
                                      'Online marketing', 'Affiliates', 'SEM', 'Radio', 'Other'],
             y_vars='gmv', height=4, aspect=0.5, kind='reg')
plt.title('cameraaccessory', fontsize = 20)

sns.pairplot(gamingaccessory, x_vars=['Total Investment', 'TV', 'Digital', 'Sponsorship', 'Content Marketing',
                                      'Online marketing', 'Affiliates', 'SEM', 'Radio', 'Other'],
             y_vars='gmv', height=4, aspect=0.5, kind='reg')
plt.title('gamingaccessory', fontsize = 20)

sns.pairplot(homeaudio, x_vars=['Total Investment', 'TV', 'Digital', 'Sponsorship', 'Content Marketing',
                                'Online marketing', 'Affiliates', 'SEM', 'Radio', 'Other'],
             y_vars='gmv', height=4, aspect=0.5, kind='reg')
plt.title('homeaudio', fontsize = 20)

# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
In [243]:
rev_ad_columns = ['gmv','Total Investment', 'Total Investment_SMA_3', 'Total Investment_SMA_5', 'Total Investment_EMA_8', \
                  'Total_Investment_Ad_Stock', 'TV', 'TV_SMA_3', 'TV_SMA_5', 'TV_EMA_8', 'TV_Ad_Stock', 'Digital', \
                  'Digital_SMA_3', 'Digital_SMA_5', 'Digital_EMA_8', 'Digital_Ad_Stock', 'Sponsorship', 'Sponsorship_SMA_3', \
                  'Sponsorship_SMA_5', 'Sponsorship_EMA_8', 'Sponsorship_Ad_Stock', 'Content Marketing', \
                  'Content Marketing_SMA_3','Content Marketing_SMA_5','Content Marketing_EMA_8', 'Content_Marketing_Ad_Stock', \
                  'Online marketing', 'Online marketing_SMA_3', 'Online marketing_SMA_5', 'Online marketing_EMA_8', \
                  'Online_marketing_Ad_Stock', 'Affiliates', 'Affiliates_SMA_3', 'Affiliates_SMA_5', 'Affiliates_EMA_8', \
                  'Affiliates_Ad_Stock', 'SEM', 'SEM_SMA_3', 'SEM_SMA_5', 'SEM_EMA_8', 'SEM_Ad_Stock', 'Radio', 'Radio_SMA_3', \
                  'Radio_SMA_5', 'Radio_EMA_8', 'Radio_Ad_Stock', 'Other', 'Other_SMA_3', 'Other_SMA_5', 'Other_EMA_8', \
                  'Other_Ad_Stock']
In [244]:
cameraaccessory_ad = cameraaccessory[rev_ad_columns]
gamingaccessory_ad = gamingaccessory[rev_ad_columns]
homeaudio_ad = homeaudio[rev_ad_columns]
homeaudio_ad.head()
Out[244]:
gmv Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock
25 4.573783e+06 4.265 0.000 0.0000 4.265000 4.265000 0.054 0.000000 0.0000 0.054000 0.054000 0.633 0.000000 0.0000 0.633000 0.633000 1.854 0.000 0.0000 1.854000 1.854000 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.332000 0.137 0.000000 0.0000 0.137000 0.137000 1.256 0.000 0.0000 1.256000 1.256000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
26 5.371525e+06 4.265 0.000 0.0000 4.265000 6.824000 0.054 0.000000 0.0000 0.054000 0.086400 0.633 0.000000 0.0000 0.633000 1.012800 1.854 0.000 0.0000 1.854000 2.966400 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.531200 0.137 0.000000 0.0000 0.137000 0.219200 1.256 0.000 0.0000 1.256000 2.009600 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
27 4.679828e+06 4.265 4.265 0.0000 4.265000 8.359400 0.054 0.054000 0.0000 0.054000 0.105840 0.633 0.633000 0.0000 0.633000 1.240680 1.854 1.854 0.0000 1.854000 3.633840 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.650720 0.137 0.137000 0.0000 0.137000 0.268520 1.256 1.256 0.0000 1.256000 2.461760 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
28 3.451151e+06 4.265 4.265 0.0000 4.265000 9.280640 0.054 0.054000 0.0000 0.054000 0.117504 0.633 0.633000 0.0000 0.633000 1.377408 1.854 1.854 0.0000 1.854000 4.034304 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.722432 0.137 0.137000 0.0000 0.137000 0.298112 1.256 1.256 0.0000 1.256000 2.733056 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
29 2.599000e+03 1.013 3.181 3.6146 3.542333 6.581384 0.001 0.036333 0.0434 0.042222 0.071502 0.256 0.507333 0.5576 0.549222 1.082445 0.213 1.307 1.5258 1.489333 2.633582 0.0 0.0 0.0 0.0 0.0 0.026 0.230 0.2708 0.264 0.459459 0.015 0.096333 0.1126 0.109889 0.193867 0.503 1.005 1.1054 1.088667 2.142834 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Relationship between Revenue and Various Ad Spends for Camera Accessories

In [245]:
plt.figure(figsize=(20,10), dpi=80, facecolor='w', edgecolor='k', frameon=True)
cam_cor = cameraaccessory_ad.corr()
sns.heatmap(cam_cor, cmap="YlGnBu", annot=True)
plt.show()

Relationship between Revenue and Various Ad Spends for Gaming Accessories

In [246]:
plt.figure(figsize=(20,10), dpi=80, facecolor='w', edgecolor='k', frameon=True)
gam_cor = gamingaccessory_ad.corr()
sns.heatmap(gam_cor, cmap="YlGnBu", annot=True)
plt.show()

Relationship between Revenue and Various Ad Spends for Home Audio

In [247]:
plt.figure(figsize=(20,10), dpi=80, facecolor='w', edgecolor='k', frameon=True)
ha_cor = homeaudio_ad.corr()
sns.heatmap(ha_cor, cmap="YlGnBu", annot=True)
plt.show()
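With about fifty columns, the annotated heatmaps are dense. When the question is only how each spend variable relates to revenue, it may be easier to read a single sorted column of the correlation matrix. The sketch below shows the idea on a small synthetic dataframe; `demo_ad` and its values are made up for illustration and stand in for `cameraaccessory_ad`, `gamingaccessory_ad` or `homeaudio_ad`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_weeks = 52

# Synthetic stand-in for one of the *_ad dataframes (values are made up).
tv = rng.gamma(2.0, 1.0, n_weeks)
sem = rng.gamma(2.0, 1.0, n_weeks)
radio = rng.gamma(2.0, 1.0, n_weeks)
demo_ad = pd.DataFrame({
    'gmv': 3.0 * tv + 1.0 * sem + rng.normal(0, 0.5, n_weeks),
    'TV': tv,
    'SEM': sem,
    'Radio': radio,
})

# Keep only the 'gmv' column of the correlation matrix, drop the
# self-correlation, and sort from strongest positive to most negative.
gmv_corr = demo_ad.corr()['gmv'].drop('gmv').sort_values(ascending=False)
print(gmv_corr)
```

Applied to the real dataframes, the same line would be `cameraaccessory_ad.corr()['gmv'].drop('gmv').sort_values(ascending=False)`.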

Building Linear Regression Models

Additive Model

A linear model assumes that the KPIs combine additively: each one contributes to the dependent variable Y independently of the others, and the individual contributions add up.

The equation can be represented as:

$$Y = \alpha + \beta_1 A_t + \beta_2 P_t + \beta_3 D_t + \beta_4 Q_t + \beta_5 T_t + \epsilon$$
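To make the additive structure concrete, here is a minimal sketch on synthetic data. Three made-up drivers stand in for terms such as A_t, P_t and D_t; the coefficients, the noise level and all variable names are assumptions chosen for illustration, not values from the ElecKart data. The fit uses plain least squares from NumPy.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Hypothetical drivers standing in for A_t, P_t, D_t (made-up data).
X = rng.normal(size=(n, 3))
true_alpha, true_beta = 2.0, np.array([1.5, -0.7, 0.3])
y = true_alpha + X @ true_beta + rng.normal(scale=0.1, size=n)

# Ordinary least squares with an explicit intercept column.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
alpha_hat, beta_hat = coef[0], coef[1:]

# In an additive model, each driver's contribution is beta_k * x_k,
# and the contributions plus the intercept sum to the prediction.
contributions = X * beta_hat
y_hat = alpha_hat + contributions.sum(axis=1)
print('alpha:', round(alpha_hat, 2), 'betas:', beta_hat.round(2))
```

Because the contributions are separable, the column sums of `contributions` give a simple per-driver decomposition of the fitted response, which is the property a budget reallocation exercise relies on.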
In [248]:
# making a copy of original dataframes
cameraaccessory_org_df = cameraaccessory_df.copy()
gamingaccessory_org_df = gamingaccessory_df.copy()
homeaudio_org_df = homeaudio_df.copy()
homeaudio_org_df.head()
Out[248]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4.573783e+06 31.450813 0.0 0.0 7.369201 2.863223 1583 1366 8 33 1 516.0 23 1374.0 0 0 0 63 0 0 4.265 0.000 0.0000 4.265000 4.265000 0.054 0.000000 0.0000 0.054000 0.054000 0.633 0.000000 0.0000 0.633000 0.633000 1.854 0.000 0.0000 1.854000 1.854000 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.332000 0.137 0.000000 0.0000 0.137000 0.137000 1.256 0.000 0.0000 1.256000 1.256000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 0.000000 0.000000 1177.0 0.000000 0.0 28.0 12.5 20.100000 0.283333 2.383333 4.416667 0.0 4.416667 0.0 0
26 29 5.371525e+06 32.966657 0.0 0.0 6.984861 2.746318 1868 1610 7 50 1 574.0 42 1623.0 0 0 0 69 1 0 4.265 0.000 0.0000 4.265000 6.824000 0.054 0.000000 0.0000 0.054000 0.086400 0.633 0.000000 0.0000 0.633000 1.012800 1.854 0.000 0.0000 1.854000 2.966400 0.0 0.0 0.0 0.0 0.0 0.332 0.000 0.0000 0.332 0.531200 0.137 0.000000 0.0000 0.137000 0.219200 1.256 0.000 0.0000 1.256000 2.009600 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 0.000000 0.000000 1177.0 0.000000 0.0 33.0 11.0 23.183333 0.000000 5.183333 1.400000 0.0 1.400000 0.0 2
27 30 4.679828e+06 32.357350 0.0 0.0 7.071749 2.860538 1758 1569 4 56 0 577.0 36 1430.0 0 0 0 46 0 0 4.265 4.265 0.0000 4.265000 8.359400 0.054 0.054000 0.0000 0.054000 0.105840 0.633 0.633000 0.0000 0.633000 1.240680 1.854 1.854 0.0000 1.854000 3.633840 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.650720 0.137 0.137000 0.0000 0.137000 0.268520 1.256 1.256 0.0000 1.256000 2.461760 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0 31.5 14.5 23.060000 0.000000 5.060000 1.080000 0.0 1.080000 0.0 0
28 31 3.451151e+06 32.207530 0.0 0.0 7.200750 2.734834 1244 1072 2 43 0 420.0 20 1025.0 0 0 0 44 1 0 4.265 4.265 0.0000 4.265000 9.280640 0.054 0.054000 0.0000 0.054000 0.117504 0.633 0.633000 0.0000 0.633000 1.377408 1.854 1.854 0.0000 1.854000 4.034304 0.0 0.0 0.0 0.0 0.0 0.332 0.332 0.0000 0.332 0.722432 0.137 0.137000 0.0000 0.137000 0.298112 1.256 1.256 0.0000 1.256000 2.733056 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 54.599588 54.599588 0.000000 1177.0 1177.000000 0.0 33.5 16.0 24.566667 0.000000 6.566667 4.633333 0.0 4.633333 0.0 0
29 32 2.599000e+03 16.130000 0.0 0.0 9.000000 2.000000 0 0 0 0 0 0.0 0 1.0 0 0 0 0 0 0 1.013 3.181 3.6146 3.542333 6.581384 0.001 0.036333 0.0434 0.042222 0.071502 0.256 0.507333 0.5576 0.549222 1.082445 0.213 1.307 1.5258 1.489333 2.633582 0.0 0.0 0.0 0.0 0.0 0.026 0.230 0.2708 0.264 0.459459 0.015 0.096333 0.1126 0.109889 0.193867 0.503 1.005 1.1054 1.088667 2.142834 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 59.987101 56.395426 55.677091 1206.0 1186.666667 1182.8 28.5 15.0 21.650000 0.000000 3.650000 0.350000 0.0 0.350000 0.0 0
In [249]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(gamingaccessory_org_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(gamingaccessory_org_df.isnull().sum()/gamingaccessory_org_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[249]:
Total Percentage
Sale 0 0.0
product_vertical_tvoutcableaccessory 0 0.0
holiday_week 0 0.0
Total Investment 0 0.0
Total Investment_SMA_3 0 0.0

Rescaling the Features of the 3 Dataframes

We will use standard scaling, which rescales every column to zero mean and unit variance.

We first drop the Week column, as it is a row identifier and will not help in predicting revenue.
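As a quick illustration of what standard scaling does, the sketch below compares `StandardScaler` with the formula z = (x - mean) / std computed by hand. The input values are made up; note that scikit-learn divides by the population standard deviation (`ddof=0`).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up column values, used only to illustrate the transform.
x = np.array([[4.265], [4.265], [1.013], [6.5], [2.2]])

scaled = StandardScaler().fit_transform(x)

# Same result by hand: subtract the mean, divide by the population std (ddof=0).
manual = (x - x.mean(axis=0)) / x.std(axis=0, ddof=0)
print(np.allclose(scaled, manual))
```

After this transform, coefficients of a linear model are expressed per standard deviation of each feature, which makes them comparable across spend channels with very different magnitudes.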

In [250]:
# removing columns
cameraaccessory_df = cameraaccessory_df.drop('Week', axis=1)
gamingaccessory_df = gamingaccessory_df.drop('Week', axis=1)
homeaudio_df = homeaudio_df.drop('Week', axis=1)
In [251]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# Standardise every column of each dataframe, including the target gmv.
# The scaler is refit on each call, so afterwards it holds the home audio statistics only.
cameraaccessory_df[cameraaccessory_df.columns]=scaler.fit_transform(cameraaccessory_df[cameraaccessory_df.columns])
gamingaccessory_df[gamingaccessory_df.columns]=scaler.fit_transform(gamingaccessory_df[gamingaccessory_df.columns])
homeaudio_df[homeaudio_df.columns]=scaler.fit_transform(homeaudio_df[homeaudio_df.columns])

cameraaccessory_df.head()
Out[251]:
gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 -0.463986 -0.568488 -0.621962 -0.622705 0.975873 0.285022 -0.682393 -0.752433 0.406002 0.158350 0.000000 -0.413214 2.553918 -0.697481 -1.277445 -0.415704 -0.252646 -0.533081 -1.137094 0.707957 -0.429629 -0.879137 -0.032741 -1.205263 -0.204124 -0.439768 -0.193247 -0.428571 -1.309203 -0.142857 -0.997060 -1.083473 -0.531085 -1.138791 -1.631086 -1.699859 -1.633405 -1.649491 -1.251808 -1.419631 -1.491854 -1.652729 -1.518317 0.060467 -0.830498 -0.901105 0.077377 -0.554946 -0.868524 -1.264503 -1.318554 -1.299296 -1.278783 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -1.849822 -2.030452 -2.004557 -1.933248 -2.030119 -1.887653 -2.094357 -2.029475 -1.901528 -2.051610 -0.283006 -1.038332 -1.119630 -0.486269 -0.870528 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.63299 -0.787503 -0.637838 1.241485 -4.575500 -3.282764 0.138750 -4.698897 -3.338503 0.737609 1.051735 0.904298 -0.994318 0.392712 0.452202 -0.359762 0.347311 -0.269021 -0.478426
26 -0.261122 -0.524957 -0.621962 -0.622705 0.194883 0.370449 -0.599711 -0.648998 -0.266187 0.323447 0.215047 -0.399653 0.483174 -0.697481 -1.277445 -0.415704 -0.252646 -0.533081 -0.654105 0.728597 -0.287038 -0.879137 -0.306924 -1.203913 -0.204124 -0.324809 -0.193247 -0.428571 -0.505563 -0.142857 -0.647623 0.922958 -0.531085 -1.138791 -1.631086 -1.699859 -1.633405 -1.532676 -1.251808 -1.419631 -1.491854 -1.652729 -1.494622 0.060467 -0.830498 -0.901105 0.077377 -0.290548 -0.868524 -1.264503 -1.318554 -1.299296 -1.188393 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -1.849822 -2.030452 -2.004557 -1.933248 -1.984615 -1.887653 -2.094357 -2.029475 -1.901528 -1.990966 -0.283006 -1.038332 -1.119630 -0.486269 -0.660863 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.63299 -0.787503 -0.637838 1.241485 -4.575500 -3.282764 0.138750 -4.698897 -3.338503 1.269795 0.882827 1.256121 -1.034268 1.651885 -0.413172 -0.359762 -0.459971 -0.269021 0.814617
27 -0.271684 -0.611923 -0.620334 -0.619954 0.221648 0.511811 -0.586375 -0.662790 0.540440 -0.114420 0.287886 -0.114865 1.173422 -0.697481 -1.277445 -0.415704 -0.252646 -0.533081 -0.930099 0.026832 -0.358333 -0.879137 -0.306924 -1.198514 -0.204124 -0.178778 -0.193247 -0.428571 -0.578621 -0.142857 -0.181707 -1.083473 -0.531085 -1.138791 -1.202709 -1.699859 -1.633405 -1.462586 -1.251808 -1.330071 -1.491854 -1.652729 -1.480405 0.060467 0.098113 -0.901105 0.077377 -0.131910 -0.868524 -0.935221 -1.318554 -1.299296 -1.134160 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -1.849822 -1.848811 -2.004557 -1.933248 -1.957312 -1.887653 -1.849877 -2.029475 -1.901528 -1.954580 -0.283006 -0.292774 -1.119630 -0.486269 -0.535064 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.63299 -0.787503 -0.637838 1.241485 0.691337 -3.282764 0.138750 0.245144 -3.338503 1.110139 1.276946 1.242048 -1.034268 1.596422 -0.504969 -0.359762 -0.545605 -0.269021 -0.478426
28 -1.043536 -0.684735 -0.621962 -0.622705 0.161437 0.398084 -1.017789 -1.188585 0.338783 -0.904016 -0.714511 -0.704783 0.943339 -0.697481 -1.277445 -0.415704 -0.252646 -0.533081 -1.275091 -0.468531 -0.607869 -0.336460 -0.927869 -1.202563 -0.204124 -0.930679 -0.193247 -0.428571 -0.578621 -0.142857 -1.171778 0.922958 -0.531085 -1.138791 -1.202709 -1.699859 -1.633405 -1.420533 -1.251808 -1.330071 -1.491854 -1.652729 -1.471875 0.060467 0.098113 -0.901105 0.077377 -0.036726 -0.868524 -0.935221 -1.318554 -1.299296 -1.101619 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -1.849822 -1.848811 -2.004557 -1.933248 -1.940931 -1.887653 -1.849877 -2.029475 -1.901528 -1.932749 -0.283006 -0.292774 -1.119630 -0.486269 -0.459584 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.63299 -0.787503 -0.637838 1.241485 0.691337 -3.282764 0.138750 0.245144 -3.338503 1.323014 1.445854 1.413966 -1.034268 2.273977 0.514356 -0.359762 0.405292 -0.269021 -0.478426
29 -2.407140 -4.280821 -0.621962 -0.622705 3.801387 3.533425 -1.954629 -2.214892 -2.686066 -2.231974 -2.382860 -1.925303 -1.587571 -0.697481 -1.277445 -0.415704 -0.252646 -0.533081 -1.413087 -2.202304 -1.461044 -0.879137 -1.806869 -1.240358 -0.204124 -1.959104 -0.193247 -0.428571 -2.258960 -0.142857 -1.928891 -1.083473 -0.531085 -1.440518 -1.311586 -1.312175 -1.735007 -1.543751 -1.333579 -1.359372 -1.415150 -1.677700 -1.505517 -0.423320 -0.086240 0.032606 -0.116650 -0.242065 -1.131096 -1.032371 -1.024391 -1.394036 -1.215437 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -2.015055 -1.904617 -1.856367 -1.973751 -2.001003 -2.106313 -1.922448 -1.831989 -1.953187 -2.009656 -0.679831 -0.441767 -0.385316 -0.640350 -0.623795 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.63299 -0.787503 -0.637838 2.562141 0.864569 0.744545 0.506552 0.285749 0.346561 0.790828 1.333249 1.081161 -1.034268 0.962338 -0.714380 -0.359762 -0.740958 -0.269021 -0.478426

Splitting the 3 Dataframes into Training and Testing Sets

Before fitting a regression, we split each dataframe into a training set (70% of the rows) and a test set (30%).

In [252]:
from sklearn.model_selection import train_test_split

# Fix random_state so that the train and test sets contain the same rows on every run

cameraaccessory_train, cameraaccessory_test = train_test_split(cameraaccessory_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)

gamingaccessory_train, gamingaccessory_test = train_test_split(gamingaccessory_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)

homeaudio_train, homeaudio_test = train_test_split(homeaudio_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)
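In the cells above, each dataframe is scaled before it is split, so the test rows also feed into the means and standard deviations used for scaling. A common alternative is to fit the scaler on the training rows only and reuse it on the test rows. The sketch below shows that order of operations on a synthetic dataframe; `demo_df`, `demo_scaler` and the column values are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
demo_df = pd.DataFrame(rng.normal(loc=5.0, scale=2.0, size=(50, 3)),
                       columns=['gmv', 'TV', 'SEM'])  # made-up data

train, test = train_test_split(demo_df, train_size=0.7, test_size=0.3, random_state=100)

# Learn the mean and std from the training rows only...
demo_scaler = StandardScaler().fit(train)

# ...then apply the same transform to both splits.
train_scaled = pd.DataFrame(demo_scaler.transform(train), columns=train.columns, index=train.index)
test_scaled = pd.DataFrame(demo_scaler.transform(test), columns=test.columns, index=test.index)
print(train_scaled.shape, test_scaled.shape)
```

With roughly fifty weekly rows per sub-category the difference is likely small, but keeping the test rows out of the fit makes the reported test scores easier to defend.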

Dividing the 3 dataframes into X and Y sets for the model building

In [253]:
# pop() removes 'gmv' from the training dataframe in place and returns it as the target
y_cameraaccessory_train = cameraaccessory_train.pop('gmv')
X_cameraaccessory_train = cameraaccessory_train

y_gamingaccessory_train = gamingaccessory_train.pop('gmv')
X_gamingaccessory_train = gamingaccessory_train

y_homeaudio_train = homeaudio_train.pop('gmv')
X_homeaudio_train = homeaudio_train

X_cameraaccessory_train.head()
Out[253]:
Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
0 0.060657 -0.621962 -0.622705 -0.497921 0.553036 0.893232 0.627852 0.540440 -0.222092 -0.381535 -0.155549 -0.897323 0.381098 -0.169421 -0.415704 3.958114 -0.533081 0.656865 -0.385971 0.380768 0.206217 -0.105319 1.488997 -0.204124 0.010750 -0.193247 -0.428571 -0.359446 -0.142857 -0.123468 -1.083473 -0.531085 0.186504 0.776833 0.987797 0.863692 0.704290 0.354300 0.677298 0.802561 0.729798 0.657683 -0.605542 -0.025604 0.163219 0.064486 -0.158850 -0.997170 0.476156 0.908352 0.606863 0.225891 0.307176 0.494556 0.605060 0.628475 0.480500 1.062276 1.065636 1.085006 0.986779 1.059869 1.168223 1.032135 1.011728 0.949932 1.043430 -0.391566 0.275496 0.505240 0.448532 0.169488 3.459358 1.031321 0.382238 0.681314 1.525878 3.205892 0.939272 0.342332 0.586754 1.393930 -0.598617 -0.115920 0.048808 -1.446603 -0.319129 -0.095847 -1.071822 -1.932310 -1.381597 1.494331 -0.679084 1.441884 2.654145 1.894972 -0.269021 -0.478426
1 -0.040097 -0.620341 -0.621199 0.437612 -0.028840 0.045075 -0.197331 0.540440 -0.336942 -0.770007 -0.365749 0.023008 0.021572 0.152264 -0.415704 -0.252646 -0.533081 0.035879 -0.633653 -0.087409 0.206217 -0.048869 0.240405 -0.204124 -0.377628 -0.193247 -0.428571 0.371136 -0.142857 -0.356426 0.922958 -0.531085 0.186504 0.504406 0.813250 0.755053 0.531642 0.354300 0.536876 0.712778 0.690811 0.572284 -0.605542 -0.344432 -0.055139 -0.199758 -0.414193 -0.997170 -0.300930 0.402233 0.136855 -0.350919 0.307176 0.423250 0.555838 0.610571 0.453296 1.062276 1.083691 1.095841 1.051716 1.101323 1.168223 1.112439 1.059082 1.038233 1.123928 -0.391566 -0.069779 0.273401 0.198646 -0.094166 3.459358 2.630985 1.421348 1.846339 2.841147 3.205892 2.419383 1.317654 1.655621 2.612992 -0.598617 -0.074344 0.067514 -1.446603 -0.299527 -0.087123 -1.763663 -1.707099 -1.815194 2.030130 -0.679084 -0.088059 -0.359762 -0.156683 6.323587 -0.478426
10 0.043235 -0.420785 -0.405268 -0.666914 -0.470323 0.223108 0.396272 -0.736719 -0.164667 1.359652 1.485374 -0.437157 0.021572 0.188007 -0.415704 -0.252646 2.730682 0.656865 -0.117649 -0.296544 0.748895 -0.355309 0.486074 -0.204124 -0.455304 -0.193247 -0.428571 0.955601 -0.142857 0.983082 -1.083473 1.882938 0.785597 0.880521 0.321637 0.463286 0.648741 2.252007 2.436440 1.339139 1.533744 1.997519 -0.078124 -0.060323 -0.117096 -0.335989 -0.140840 0.498420 0.582069 0.064963 0.024128 0.330126 -0.281489 -0.320366 -0.304382 -0.242232 -0.323380 0.459661 0.491172 0.379514 0.558372 0.464717 0.652040 0.678799 0.512928 0.679576 0.625789 -0.260872 -0.267843 -0.343200 -0.507431 -0.362411 0.802606 0.981554 0.349910 0.908442 0.801507 1.703747 2.072356 1.088980 1.664670 1.780036 -0.126119 0.153167 0.300867 -1.915867 -0.435344 0.081739 -0.060669 -0.749952 -0.736526 0.697211 -0.679084 2.857082 -0.359762 2.590751 -0.269021 -0.478426
42 -0.302631 -0.621177 -0.621909 0.519808 0.584108 0.019070 -0.123203 -1.140032 -0.042638 -0.152614 -0.481021 -0.207074 -0.697481 -0.133678 -0.415704 -0.252646 -0.533081 -0.102118 1.244601 0.150244 -0.879137 0.515626 -0.461507 -0.204124 0.200279 -0.193247 -0.428571 0.005845 -0.142857 -0.298186 0.922958 -0.531085 -0.584141 0.478995 1.477093 1.126800 0.537489 -0.032955 0.362731 0.733633 0.480230 0.381794 -0.424603 0.962176 2.440563 1.826575 1.036395 -0.711717 0.324244 1.348612 1.206008 0.416308 -0.544605 0.783174 2.196583 1.603611 0.842346 0.083297 0.507586 0.852323 0.529375 0.509303 0.230849 0.511648 0.730057 0.448902 0.506230 -0.394728 0.954176 2.339231 1.810066 1.030520 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.632990 -0.787503 -0.637838 -0.621422 -0.125394 0.003992 0.848988 0.448170 0.459966 -0.433199 -0.524742 -0.381280 0.258231 -0.679084 0.657788 -0.359762 0.539096 -0.269021 3.400704
32 -0.575243 -0.621962 -0.622705 0.369207 0.525634 -0.575707 -0.673133 0.204345 -0.444615 0.225453 -0.745467 0.023008 -0.697481 -0.777047 -0.415704 -0.252646 -0.533081 -0.102118 0.357075 -0.156328 -0.879137 -0.395631 -0.991990 -0.204124 -0.619976 -0.193247 -0.428571 -1.236145 -0.142857 -0.006989 0.922958 -0.531085 -1.440518 -1.529340 -1.591209 -1.960478 -1.704624 -1.333579 -1.417973 -1.490087 -1.733114 -1.549440 -0.423320 -0.454946 -0.472428 -0.547230 -0.510155 -1.131096 -1.226673 -1.277489 -1.604282 -1.364037 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -2.015055 -2.016227 -1.990329 -2.063634 -2.079433 -2.106313 -2.067589 -2.003167 -2.067828 -2.110066 -0.679831 -0.739753 -0.785488 -0.982282 -0.838188 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.632990 -0.787503 -0.637838 2.562141 1.211033 1.056302 0.506552 0.366960 0.418841 1.163358 1.614762 1.401794 -1.034268 2.226009 2.662014 -0.359762 2.408778 -0.269021 -0.478426

Dividing into X and Y test sets for the model building for 3 dataframes

In [254]:
y_cameraaccessory_test = cameraaccessory_test.pop('gmv')
X_cameraaccessory_test = cameraaccessory_test

y_gamingaccessory_test = gamingaccessory_test.pop('gmv')
X_gamingaccessory_test = gamingaccessory_test

y_homeaudio_test = homeaudio_test.pop('gmv')
X_homeaudio_test = homeaudio_test

X_cameraaccessory_test.head()
Out[254]:
Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_cameraaccessory product_vertical_camerabag product_vertical_camerabattery product_vertical_camerabatterycharger product_vertical_camerabatterygrip product_vertical_cameraeyecup product_vertical_camerafilmrolls product_vertical_camerahousing product_vertical_cameraledlight product_vertical_cameramicrophone product_vertical_cameramount product_vertical_cameraremotecontrol product_vertical_cameratripod product_vertical_extensiontube product_vertical_filter product_vertical_flash product_vertical_flashshoeadapter product_vertical_lens product_vertical_reflectorumbrella product_vertical_softbox product_vertical_strap product_vertical_teleconverter product_vertical_telescope payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
31 -0.928400 -0.621962 -0.622705 4.262882 -1.795333 -1.954629 -2.215467 -2.686066 -2.231974 -2.393266 -1.925303 -1.587571 -0.697481 -1.277445 -0.415704 -0.252646 -0.533081 -1.413087 -2.202304 -1.463421 -0.879137 -1.806869 -1.240358 -0.204124 -1.949783 -0.193247 -0.428571 -2.258960 -0.142857 -1.928891 0.922958 -0.531085 -1.440518 -1.529340 -1.521451 -1.923297 -1.688655 -1.333579 -1.417973 -1.471353 -1.723976 -1.545080 -0.423320 -0.454946 -0.346170 -0.476226 -0.483543 -1.131096 -1.226673 -1.214215 -1.569611 -1.349286 -0.696231 -0.794039 -0.910286 -1.146903 -0.916735 -2.015055 -2.016227 -1.956838 -2.048812 -2.071647 -2.106313 -2.067589 -1.960372 -2.048923 -2.100098 -0.679831 -0.739753 -0.685445 -0.925896 -0.816907 -0.464728 -0.568343 -0.656871 -0.816574 -0.666236 -0.444639 -0.540838 -0.632990 -0.787503 -0.637838 2.562141 1.211033 0.978363 0.506552 0.366960 0.400771 0.844047 1.220643 0.921414 -0.928518 0.670030 -0.814782 -0.359762 -0.834621 -0.269021 1.461139
11 0.167884 0.429363 0.433071 -0.840190 -0.549591 -0.322326 0.691637 -0.871157 0.215775 1.383932 2.136318 -0.667240 0.381098 0.330978 -0.415704 -0.252646 3.274642 -0.447110 0.191953 -0.118304 0.748895 0.015645 0.711495 -0.204124 -0.138387 -0.193247 1.000000 0.078903 -0.142857 0.575406 0.922958 1.882938 0.785597 0.880521 0.651896 0.645360 0.793068 2.252007 2.436440 1.978224 1.895612 2.275711 -0.078124 -0.060323 -0.069539 -0.299714 -0.117269 0.498420 0.582069 0.375435 0.223249 0.471124 -0.281489 -0.320366 -0.334853 -0.287599 -0.340957 0.459661 0.491172 0.448575 0.570794 0.489299 0.652040 0.678799 0.604480 0.706007 0.660870 -0.260872 -0.267843 -0.300286 -0.494134 -0.344308 0.802606 0.981554 0.685504 1.008867 0.922381 1.703747 2.072356 1.662970 1.928511 2.008609 -0.126119 0.153167 0.281961 -1.915867 -0.435344 -0.047244 -0.539636 -0.918861 -0.605142 0.534860 -0.679084 -0.003366 0.341433 0.067598 -0.269021 -0.478426
12 0.508729 1.718407 1.722050 -0.439781 -0.662751 -1.803934 0.696234 -0.736719 0.158350 0.274012 0.590325 -0.437157 0.381098 0.116521 -0.415704 -0.252646 2.186721 0.449870 -0.385971 0.288083 0.748895 -0.298860 0.851877 -0.204124 0.094640 -0.193247 -0.428571 0.663368 -0.142857 0.750124 -1.083473 -0.531085 -0.215890 0.519137 0.750613 0.449739 0.386930 0.672128 1.870330 2.255349 1.694611 1.693753 -0.472083 -0.210446 -0.124798 -0.429500 -0.316844 -0.192812 0.326316 0.519335 0.128713 0.204696 -0.656095 -0.462977 -0.463768 -0.463490 -0.550000 0.200472 0.403635 0.465101 0.516921 0.394400 0.417248 0.600874 0.650081 0.671094 0.585272 -0.388931 -0.315924 -0.289657 -0.533515 -0.401053 -0.464728 0.464922 0.685504 0.603214 0.286934 -0.444639 1.201291 1.662970 1.324952 0.950030 0.561982 0.243426 0.303664 0.963134 -0.117503 -0.034782 -1.018603 -1.312980 -1.246573 1.327481 -0.679084 -0.140651 0.350659 -0.058560 -0.269021 -0.478426
3 0.295361 -0.620596 -0.621167 0.119765 0.146536 0.927239 0.713473 0.473221 0.222953 0.870594 -0.223355 -0.437157 -0.337955 0.045036 -0.415704 -0.252646 -0.533081 2.036833 -0.613013 -0.123057 1.291572 0.209186 1.325667 -0.204124 -0.122852 -0.193247 -0.428571 0.444194 -0.142857 0.575406 -1.083473 -0.531085 -0.642872 -0.067300 0.097861 0.274441 -0.079447 -0.537468 0.076911 0.239120 0.346236 0.049150 -0.260347 -0.531719 -0.620121 -0.551019 -0.527211 -0.789960 -1.001350 -1.066189 -0.659412 -0.924021 -0.165540 0.171982 0.283943 0.399958 0.170819 0.118935 0.783146 0.937144 0.940821 0.750998 0.184249 0.866174 1.008571 0.969410 0.813565 -0.431091 -0.429895 -0.432083 -0.279795 -0.425090 -0.464728 2.630985 3.499566 2.507501 2.195788 -0.444639 2.419383 3.268299 2.262212 2.014839 0.194234 0.071232 0.170422 0.709477 -0.041893 0.044976 -0.858948 -0.975163 -1.024069 1.052531 -0.679084 0.160557 -0.359762 0.075243 -0.269021 0.814617
18 0.575176 1.648847 1.632149 -1.061468 -0.424510 0.767876 0.546827 0.069908 0.890521 0.156083 0.834429 -0.897323 -0.337955 2.225343 0.678254 -0.252646 -0.533081 -0.240114 0.047472 -0.455771 -0.336460 -0.153704 0.901821 -0.204124 0.411557 -0.193247 2.428571 0.955601 -0.142857 0.924843 0.922958 -0.531085 -0.086089 -0.063115 -0.085524 0.020782 -0.068683 -0.912380 -0.965195 -0.281551 -0.006071 -0.610549 -0.554212 -0.604580 -0.600362 -0.861674 -0.689565 -0.151370 -0.139190 -0.116995 -0.093285 -0.135051 0.008384 0.010696 -0.333681 -0.348556 -0.158826 0.524998 0.557372 0.452296 0.559653 0.527779 0.304333 0.332600 0.399979 0.512389 0.386069 -0.216605 -0.217981 -0.288462 -0.481405 -0.303883 0.814233 0.995773 0.359147 0.660978 0.778590 0.092727 0.112788 -0.202282 0.162155 0.022443 -0.546179 -0.012132 0.269903 0.785574 0.459372 0.504830 -0.007451 0.094589 0.025693 -0.244668 -0.679084 -0.728723 -0.359762 -0.754339 -0.269021 -0.478426

Building Linear Regression model for cameraaccessory

In [255]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

cameraaccessory_model = LinearRegression().fit(X_cameraaccessory_train, y_cameraaccessory_train)
y_cameraaccessory_test_pred = cameraaccessory_model.predict(X_cameraaccessory_test)

print('R2 Score: {}'.format(r2_score(y_cameraaccessory_test, y_cameraaccessory_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_test, y_cameraaccessory_test_pred)))
R2 Score: 0.8277391712124689
Mean Squared Error: 0.17048591864652593
With simple linear regression, we get an R2 score of 0.83 and an MSE of 0.17 on the test set.

Building Linear Regression model for cameraaccessory using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross-validation for our linear regression.
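As a minimal sketch of what cross-validated prediction does, the block below runs cross_val_predict on synthetic data. The sizes, the random data and the names X, y and model are assumptions for illustration only; they are not the notebook's dataframes. It compares the in-sample R2 with the cross-validated R2 when there are few rows and many features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_predict

# Hypothetical sizes: few rows (weeks) and many features
rng = np.random.default_rng(0)
n_rows, n_features = 50, 30
X = rng.normal(size=(n_rows, n_features))
y = 2.0 * X[:, 0] + rng.normal(size=n_rows)   # only the first feature matters

# An unfitted estimator is enough: cross_val_predict clones it and
# refits the clone on the training part of every fold.
model = LinearRegression()
y_cv_pred = cross_val_predict(model, X, y, cv=10)

# In-sample fit on all rows, for comparison
in_sample_r2 = r2_score(y, model.fit(X, y).predict(X))
cv_r2 = r2_score(y, y_cv_pred)
print('In-sample R2:', in_sample_r2)
print('Cross-validated R2:', cv_r2)
```

Each row is predicted by a model that never saw it during fitting, so the cross-validated score is usually the more honest estimate of out-of-sample performance.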

In [256]:
y_cameraaccessory = cameraaccessory_df.pop('gmv')
X_cameraaccessory = cameraaccessory_df
In [257]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

cameraaccessory_model_cv = LinearRegression().fit(X_cameraaccessory, y_cameraaccessory)
cameraaccessory_predictions_cv = cross_val_predict(cameraaccessory_model_cv, X_cameraaccessory, y_cameraaccessory, cv=10)
accuracy = metrics.r2_score(y_cameraaccessory, cameraaccessory_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory, cameraaccessory_predictions_cv)))
Cross-Predicted Accuracy: -0.07749313506669542
Mean Squared Error: 1.0774931350666956
With simple linear regression, using 10-fold cross-validation, we get an R2 score of -0.08 and an MSE of 1.08.

Here the R2 score is negative, which means the cross-validated predictions have a larger squared error than simply predicting the mean of gmv for every observation (a horizontal line). The model fits the training folds but generalises poorly to the held-out folds.
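To make the meaning of a negative R2 concrete, here is a small sketch that computes R2 and MSE from their definitions on a toy target. The helper r2_and_mse and the toy values are hypothetical. In the output above, the MSE (1.0775) equals 1 minus the R2 (-0.0775), which is consistent with gmv having unit variance after scaling; the sketch reproduces that relationship.

```python
import numpy as np

def r2_and_mse(y_true, y_pred):
    """Return (R2, MSE) computed directly from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot, ss_res / len(y_true)

# Toy target, standardized to mean 0 and population variance 1 (illustrative only)
y = np.array([-1.5, -0.5, 0.5, 1.5])
y = (y - y.mean()) / y.std()

baseline = np.zeros_like(y)   # always predict the mean: the "horizontal line"
bad_model = -0.5 * y          # predictions that move against the data

r2_base, mse_base = r2_and_mse(y, baseline)
r2_bad, mse_bad = r2_and_mse(y, bad_model)
print('Baseline  R2, MSE:', r2_base, mse_base)
print('Bad model R2, MSE:', r2_bad, mse_bad)
```

The baseline scores an R2 of 0, and any model whose squared error exceeds the baseline's gets a negative R2.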

Determining Feature Importance for cameraaccessory from model without cv

In [258]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


cameraaccessory_lr_model_parameters = list(cameraaccessory_model.coef_)
cameraaccessory_lr_model_parameters.insert(0, cameraaccessory_model.intercept_)
cameraaccessory_lr_model_parameters = [round(x, 3) for x in cameraaccessory_lr_model_parameters]
cols = X_cameraaccessory_test.columns
cols = cols.insert(0, "constant")
cameraaccessory_lr_coef = list(zip(cols, cameraaccessory_lr_model_parameters))
cameraaccessory_lr_coef
Out[258]:
[('constant', -0.043),
 ('Discount%', 0.015),
 ('deliverybdays', -0.047),
 ('deliverycdays', -0.047),
 ('sla', 0.005),
 ('product_procurement_sla', 0.035),
 ('is_cod', 0.032),
 ('is_mass_market', 0.074),
 ('product_vertical_cameraaccessory', 0.042),
 ('product_vertical_camerabag', 0.182),
 ('product_vertical_camerabattery', 0.2),
 ('product_vertical_camerabatterycharger', -0.072),
 ('product_vertical_camerabatterygrip', 0.09),
 ('product_vertical_cameraeyecup', -0.031),
 ('product_vertical_camerafilmrolls', -0.138),
 ('product_vertical_camerahousing', 0.171),
 ('product_vertical_cameraledlight', -0.037),
 ('product_vertical_cameramicrophone', -0.06),
 ('product_vertical_cameramount', 0.11),
 ('product_vertical_cameraremotecontrol', -0.108),
 ('product_vertical_cameratripod', -0.018),
 ('product_vertical_extensiontube', 0.065),
 ('product_vertical_filter', 0.105),
 ('product_vertical_flash', 0.018),
 ('product_vertical_flashshoeadapter', -0.007),
 ('product_vertical_lens', 0.432),
 ('product_vertical_reflectorumbrella', 0.004),
 ('product_vertical_softbox', 0.037),
 ('product_vertical_strap', 0.064),
 ('product_vertical_teleconverter', -0.0),
 ('product_vertical_telescope', 0.016),
 ('payday_week', -0.043),
 ('holiday_week', -0.068),
 ('Total Investment', -0.009),
 ('Total Investment_SMA_3', 0.014),
 ('Total Investment_SMA_5', -0.011),
 ('Total Investment_EMA_8', -0.004),
 ('Total_Investment_Ad_Stock', -0.009),
 ('TV', -0.084),
 ('TV_SMA_3', -0.036),
 ('TV_SMA_5', 0.058),
 ('TV_EMA_8', 0.032),
 ('TV_Ad_Stock', -0.007),
 ('Digital', -0.052),
 ('Digital_SMA_3', -0.045),
 ('Digital_SMA_5', 0.061),
 ('Digital_EMA_8', -0.009),
 ('Digital_Ad_Stock', -0.047),
 ('Sponsorship', -0.036),
 ('Sponsorship_SMA_3', 0.075),
 ('Sponsorship_SMA_5', -0.073),
 ('Sponsorship_EMA_8', -0.033),
 ('Sponsorship_Ad_Stock', -0.006),
 ('Content Marketing', 0.051),
 ('Content Marketing_SMA_3', -0.001),
 ('Content Marketing_SMA_5', 0.04),
 ('Content Marketing_EMA_8', 0.002),
 ('Content_Marketing_Ad_Stock', 0.008),
 ('Online marketing', 0.151),
 ('Online marketing_SMA_3', -0.043),
 ('Online marketing_SMA_5', 0.013),
 ('Online marketing_EMA_8', 0.03),
 ('Online_marketing_Ad_Stock', 0.021),
 ('Affiliates', 0.107),
 ('Affiliates_SMA_3', -0.063),
 ('Affiliates_SMA_5', 0.026),
 ('Affiliates_EMA_8', 0.033),
 ('Affiliates_Ad_Stock', 0.011),
 ('SEM', 0.019),
 ('SEM_SMA_3', -0.02),
 ('SEM_SMA_5', 0.052),
 ('SEM_EMA_8', 0.009),
 ('SEM_Ad_Stock', -0.016),
 ('Radio', 0.01),
 ('Radio_SMA_3', -0.046),
 ('Radio_SMA_5', 0.037),
 ('Radio_EMA_8', 0.007),
 ('Radio_Ad_Stock', -0.013),
 ('Other', -0.087),
 ('Other_SMA_3', -0.05),
 ('Other_SMA_5', 0.056),
 ('Other_EMA_8', 0.01),
 ('Other_Ad_Stock', -0.029),
 ('NPS', 0.034),
 ('NPS_SMA_3', 0.095),
 ('NPS_SMA_5', 0.006),
 ('Stock Index', -0.13),
 ('Stock Index_SMA_3', -0.121),
 ('Stock Index_SMA_5', -0.026),
 ('Max Temp', 0.061),
 ('Min Temp', -0.09),
 ('Mean Temp', 0.078),
 ('Heat Deg Days', -0.1),
 ('Cool Deg Days', -0.011),
 ('Total Rain (mm)', 0.032),
 ('Total Snow (cm)', -0.024),
 ('Total Precip (mm)', 0.025),
 ('Snow on Grnd (cm)', 0.024),
 ('Sale', 0.011)]
In [259]:
cameraaccessory_lr_coef_df = pd.DataFrame(cameraaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.rename(columns=col_rename)
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.iloc[1:,:]
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.loc[cameraaccessory_lr_coef_df['Coefficients']!=0.0]
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
cameraaccessory_lr_coef_df
Out[259]:
Features Coefficients
25 product_vertical_lens 0.432
10 product_vertical_camerabattery 0.200
9 product_vertical_camerabag 0.182
15 product_vertical_camerahousing 0.171
58 Online marketing 0.151
18 product_vertical_cameramount 0.110
63 Affiliates 0.107
22 product_vertical_filter 0.105
84 NPS_SMA_3 0.095
12 product_vertical_camerabatterygrip 0.090
91 Mean Temp 0.078
49 Sponsorship_SMA_3 0.075
7 is_mass_market 0.074
21 product_vertical_extensiontube 0.065
28 product_vertical_strap 0.064
45 Digital_SMA_5 0.061
89 Max Temp 0.061
40 TV_SMA_5 0.058
80 Other_SMA_5 0.056
70 SEM_SMA_5 0.052
53 Content Marketing 0.051
8 product_vertical_cameraaccessory 0.042
55 Content Marketing_SMA_5 0.040
75 Radio_SMA_5 0.037
27 product_vertical_softbox 0.037
5 product_procurement_sla 0.035
83 NPS 0.034
66 Affiliates_EMA_8 0.033
41 TV_EMA_8 0.032
94 Total Rain (mm) 0.032
6 is_cod 0.032
61 Online marketing_EMA_8 0.030
65 Affiliates_SMA_5 0.026
96 Total Precip (mm) 0.025
97 Snow on Grnd (cm) 0.024
62 Online_marketing_Ad_Stock 0.021
68 SEM 0.019
23 product_vertical_flash 0.018
30 product_vertical_telescope 0.016
1 Discount% 0.015
34 Total Investment_SMA_3 0.014
60 Online marketing_SMA_5 0.013
67 Affiliates_Ad_Stock 0.011
98 Sale 0.011
81 Other_EMA_8 0.010
73 Radio 0.010
71 SEM_EMA_8 0.009
57 Content_Marketing_Ad_Stock 0.008
76 Radio_EMA_8 0.007
85 NPS_SMA_5 0.006
4 sla 0.005
26 product_vertical_reflectorumbrella 0.004
56 Content Marketing_EMA_8 0.002
54 Content Marketing_SMA_3 -0.001
36 Total Investment_EMA_8 -0.004
52 Sponsorship_Ad_Stock -0.006
24 product_vertical_flashshoeadapter -0.007
42 TV_Ad_Stock -0.007
37 Total_Investment_Ad_Stock -0.009
46 Digital_EMA_8 -0.009
33 Total Investment -0.009
35 Total Investment_SMA_5 -0.011
93 Cool Deg Days -0.011
77 Radio_Ad_Stock -0.013
72 SEM_Ad_Stock -0.016
20 product_vertical_cameratripod -0.018
69 SEM_SMA_3 -0.020
95 Total Snow (cm) -0.024
88 Stock Index_SMA_5 -0.026
82 Other_Ad_Stock -0.029
13 product_vertical_cameraeyecup -0.031
51 Sponsorship_EMA_8 -0.033
39 TV_SMA_3 -0.036
48 Sponsorship -0.036
16 product_vertical_cameraledlight -0.037
31 payday_week -0.043
59 Online marketing_SMA_3 -0.043
44 Digital_SMA_3 -0.045
74 Radio_SMA_3 -0.046
47 Digital_Ad_Stock -0.047
3 deliverycdays -0.047
2 deliverybdays -0.047
79 Other_SMA_3 -0.050
43 Digital -0.052
17 product_vertical_cameramicrophone -0.060
64 Affiliates_SMA_3 -0.063
32 holiday_week -0.068
11 product_vertical_camerabatterycharger -0.072
50 Sponsorship_SMA_5 -0.073
38 TV -0.084
78 Other -0.087
90 Min Temp -0.090
92 Heat Deg Days -0.100
19 product_vertical_cameraremotecontrol -0.108
87 Stock Index_SMA_3 -0.121
86 Stock Index -0.130
14 product_vertical_camerafilmrolls -0.138
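The coefficient table above can also be assembled more directly by pairing coef_ with the column names in a pandas Series. The sketch below uses hypothetical toy data (X_toy, y_toy, toy_model) rather than the notebook's dataframes, and mirrors the same steps: round to 3 decimals, drop zero coefficients, sort in descending order.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data standing in for the scaled training frame
rng = np.random.default_rng(1)
X_toy = pd.DataFrame(rng.normal(size=(40, 3)), columns=['TV', 'Digital', 'Sale'])
y_toy = 0.5 * X_toy['TV'] - 0.2 * X_toy['Digital'] + rng.normal(scale=0.1, size=40)

toy_model = LinearRegression().fit(X_toy, y_toy)

# Pair each coefficient with its column name, round, drop zeros, sort descending
coef_table = (
    pd.Series(toy_model.coef_, index=X_toy.columns, name='Coefficients')
    .round(3)
    .loc[lambda s: s != 0.0]
    .sort_values(ascending=False)
    .rename_axis('Features')
    .reset_index()
)
print(coef_table)
```

Indexing the Series by the training columns keeps names and coefficients aligned without inserting a "constant" placeholder and slicing it off again.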

Plotting the Features in descending order of Importance for cameraaccessory

In [260]:
# Use a tall figure so that all feature labels fit on the y-axis.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=cameraaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplot fits into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive coefficients for GMV (revenue) in the cameraaccessory model are:
Features Coefficients
product_vertical_lens 0.432
product_vertical_camerabattery 0.200
product_vertical_camerabag 0.182
product_vertical_camerahousing 0.171
Online marketing 0.151

Building Linear Regression model for gamingaccessory

In [261]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

gamingaccessory_model = LinearRegression().fit(X_gamingaccessory_train, y_gamingaccessory_train)
y_gamingaccessory_test_pred = gamingaccessory_model.predict(X_gamingaccessory_test)

print('R2 Score: {}'.format(r2_score(y_gamingaccessory_test, y_gamingaccessory_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_test, y_gamingaccessory_test_pred)))
R2 Score: 0.9334637176804224
Mean Squared Error: 0.05325157876230199
With simple linear regression, we get an R2 score of 0.93 and an MSE of 0.05 on the test set.

Building Linear Regression model for gamingaccessory using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross-validation for our linear regression.

In [262]:
y_gamingaccessory = gamingaccessory_df.pop('gmv')
X_gamingaccessory = gamingaccessory_df
In [263]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

gamingaccessory_model_cv = LinearRegression().fit(X_gamingaccessory, y_gamingaccessory)
gamingaccessory_predictions_cv = cross_val_predict(gamingaccessory_model_cv, X_gamingaccessory, y_gamingaccessory, cv=10)
accuracy = metrics.r2_score(y_gamingaccessory, gamingaccessory_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory, gamingaccessory_predictions_cv)))
Cross-Predicted Accuracy: 0.5113679167194116
Mean Squared Error: 0.4886320832805883
With simple linear regression, using 10-fold cross-validation, we get an R2 score of 0.51 and an MSE of 0.49.

Determining Feature Importance for gamingaccessory from model without cv

In [264]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


gamingaccessory_lr_model_parameters = list(gamingaccessory_model.coef_)
gamingaccessory_lr_model_parameters.insert(0, gamingaccessory_model.intercept_)
gamingaccessory_lr_model_parameters = [round(x, 3) for x in gamingaccessory_lr_model_parameters]
cols = X_gamingaccessory_test.columns
cols = cols.insert(0, "constant")
gamingaccessory_lr_coef = list(zip(cols, gamingaccessory_lr_model_parameters))
gamingaccessory_lr_coef
Out[264]:
[('constant', 0.021),
 ('Discount%', -0.003),
 ('deliverybdays', 0.015),
 ('deliverycdays', 0.015),
 ('sla', 0.051),
 ('product_procurement_sla', 0.025),
 ('is_cod', 0.064),
 ('is_mass_market', 0.167),
 ('product_vertical_gamecontrolmount', -0.013),
 ('product_vertical_gamepad', 0.201),
 ('product_vertical_gamingaccessorykit', 0.126),
 ('product_vertical_gamingadapter', 0.039),
 ('product_vertical_gamingchargingstation', 0.009),
 ('product_vertical_gamingheadset', 0.183),
 ('product_vertical_gamingkeyboard', 0.068),
 ('product_vertical_gamingmemorycard', 0.028),
 ('product_vertical_gamingmouse', 0.107),
 ('product_vertical_gamingmousepad', -0.003),
 ('product_vertical_gamingspeaker', 0.038),
 ('product_vertical_joystickgamingwheel', 0.089),
 ('product_vertical_motioncontroller', 0.056),
 ('product_vertical_tvoutcableaccessory', 0.091),
 ('payday_week', -0.035),
 ('holiday_week', -0.016),
 ('Total Investment', 0.001),
 ('Total Investment_SMA_3', 0.001),
 ('Total Investment_SMA_5', 0.021),
 ('Total Investment_EMA_8', -0.008),
 ('Total_Investment_Ad_Stock', 0.004),
 ('TV', 0.078),
 ('TV_SMA_3', -0.017),
 ('TV_SMA_5', 0.008),
 ('TV_EMA_8', 0.004),
 ('TV_Ad_Stock', 0.02),
 ('Digital', -0.018),
 ('Digital_SMA_3', 0.029),
 ('Digital_SMA_5', -0.002),
 ('Digital_EMA_8', -0.037),
 ('Digital_Ad_Stock', -0.002),
 ('Sponsorship', -0.022),
 ('Sponsorship_SMA_3', 0.02),
 ('Sponsorship_SMA_5', 0.032),
 ('Sponsorship_EMA_8', -0.004),
 ('Sponsorship_Ad_Stock', 0.007),
 ('Content Marketing', -0.035),
 ('Content Marketing_SMA_3', 0.009),
 ('Content Marketing_SMA_5', 0.014),
 ('Content Marketing_EMA_8', -0.031),
 ('Content_Marketing_Ad_Stock', -0.012),
 ('Online marketing', 0.016),
 ('Online marketing_SMA_3', -0.019),
 ('Online marketing_SMA_5', 0.002),
 ('Online marketing_EMA_8', -0.01),
 ('Online_marketing_Ad_Stock', -0.005),
 ('Affiliates', 0.048),
 ('Affiliates_SMA_3', -0.003),
 ('Affiliates_SMA_5', 0.008),
 ('Affiliates_EMA_8', 0.0),
 ('Affiliates_Ad_Stock', 0.012),
 ('SEM', -0.027),
 ('SEM_SMA_3', 0.017),
 ('SEM_SMA_5', -0.0),
 ('SEM_EMA_8', -0.038),
 ('SEM_Ad_Stock', -0.009),
 ('Radio', 0.038),
 ('Radio_SMA_3', -0.065),
 ('Radio_SMA_5', 0.011),
 ('Radio_EMA_8', 0.027),
 ('Radio_Ad_Stock', -0.002),
 ('Other', 0.06),
 ('Other_SMA_3', -0.067),
 ('Other_SMA_5', 0.01),
 ('Other_EMA_8', 0.03),
 ('Other_Ad_Stock', 0.006),
 ('NPS', -0.006),
 ('NPS_SMA_3', 0.034),
 ('NPS_SMA_5', -0.037),
 ('Stock Index', 0.044),
 ('Stock Index_SMA_3', 0.03),
 ('Stock Index_SMA_5', -0.046),
 ('Max Temp', -0.041),
 ('Min Temp', 0.053),
 ('Mean Temp', -0.016),
 ('Heat Deg Days', 0.047),
 ('Cool Deg Days', 0.085),
 ('Total Rain (mm)', -0.022),
 ('Total Snow (cm)', -0.005),
 ('Total Precip (mm)', -0.022),
 ('Snow on Grnd (cm)', -0.024),
 ('Sale', 0.045)]
In [265]:
gamingaccessory_lr_coef_df = pd.DataFrame(gamingaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.rename(columns=col_rename)
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.iloc[1:,:]
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.loc[gamingaccessory_lr_coef_df['Coefficients']!=0.0]
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
gamingaccessory_lr_coef_df
Out[265]:
Features Coefficients
9 product_vertical_gamepad 0.201
13 product_vertical_gamingheadset 0.183
7 is_mass_market 0.167
10 product_vertical_gamingaccessorykit 0.126
16 product_vertical_gamingmouse 0.107
21 product_vertical_tvoutcableaccessory 0.091
19 product_vertical_joystickgamingwheel 0.089
84 Cool Deg Days 0.085
29 TV 0.078
14 product_vertical_gamingkeyboard 0.068
6 is_cod 0.064
69 Other 0.060
20 product_vertical_motioncontroller 0.056
81 Min Temp 0.053
4 sla 0.051
54 Affiliates 0.048
83 Heat Deg Days 0.047
89 Sale 0.045
77 Stock Index 0.044
11 product_vertical_gamingadapter 0.039
18 product_vertical_gamingspeaker 0.038
64 Radio 0.038
75 NPS_SMA_3 0.034
41 Sponsorship_SMA_5 0.032
78 Stock Index_SMA_3 0.030
72 Other_EMA_8 0.030
35 Digital_SMA_3 0.029
15 product_vertical_gamingmemorycard 0.028
67 Radio_EMA_8 0.027
5 product_procurement_sla 0.025
26 Total Investment_SMA_5 0.021
40 Sponsorship_SMA_3 0.020
33 TV_Ad_Stock 0.020
60 SEM_SMA_3 0.017
49 Online marketing 0.016
2 deliverybdays 0.015
3 deliverycdays 0.015
46 Content Marketing_SMA_5 0.014
58 Affiliates_Ad_Stock 0.012
66 Radio_SMA_5 0.011
71 Other_SMA_5 0.010
45 Content Marketing_SMA_3 0.009
12 product_vertical_gamingchargingstation 0.009
31 TV_SMA_5 0.008
56 Affiliates_SMA_5 0.008
43 Sponsorship_Ad_Stock 0.007
73 Other_Ad_Stock 0.006
32 TV_EMA_8 0.004
28 Total_Investment_Ad_Stock 0.004
51 Online marketing_SMA_5 0.002
25 Total Investment_SMA_3 0.001
24 Total Investment 0.001
38 Digital_Ad_Stock -0.002
36 Digital_SMA_5 -0.002
68 Radio_Ad_Stock -0.002
1 Discount% -0.003
55 Affiliates_SMA_3 -0.003
17 product_vertical_gamingmousepad -0.003
42 Sponsorship_EMA_8 -0.004
53 Online_marketing_Ad_Stock -0.005
86 Total Snow (cm) -0.005
74 NPS -0.006
27 Total Investment_EMA_8 -0.008
63 SEM_Ad_Stock -0.009
52 Online marketing_EMA_8 -0.010
48 Content_Marketing_Ad_Stock -0.012
8 product_vertical_gamecontrolmount -0.013
82 Mean Temp -0.016
23 holiday_week -0.016
30 TV_SMA_3 -0.017
34 Digital -0.018
50 Online marketing_SMA_3 -0.019
39 Sponsorship -0.022
87 Total Precip (mm) -0.022
85 Total Rain (mm) -0.022
88 Snow on Grnd (cm) -0.024
59 SEM -0.027
47 Content Marketing_EMA_8 -0.031
44 Content Marketing -0.035
22 payday_week -0.035
76 NPS_SMA_5 -0.037
37 Digital_EMA_8 -0.037
62 SEM_EMA_8 -0.038
80 Max Temp -0.041
79 Stock Index_SMA_5 -0.046
65 Radio_SMA_3 -0.065
70 Other_SMA_3 -0.067

Plotting the Features in descending order of Importance for gamingaccessory

In [266]:
# Use a tall figure so that all feature labels fit on the y-axis.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=gamingaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplot fits into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive coefficients for GMV (revenue) in the gamingaccessory model are:
Features Coefficients
product_vertical_gamepad 0.201
product_vertical_gamingheadset 0.183
is_mass_market 0.167
product_vertical_gamingaccessorykit 0.126
product_vertical_gamingmouse 0.107

Building Linear Regression model for homeaudio

In [267]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

homeaudio_model = LinearRegression().fit(X_homeaudio_train, y_homeaudio_train)
y_homeaudio_test_pred = homeaudio_model.predict(X_homeaudio_test)

print('R2 Score: {}'.format(r2_score(y_homeaudio_test, y_homeaudio_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_test, y_homeaudio_test_pred)))
R2 Score: 0.9608086220945999
Mean Squared Error: 0.09397173521656972
With simple linear regression, we get an R2 score of 0.96 and an MSE of 0.09 on the test set.

Building Linear Regression model for homeaudio using K-fold Cross Validation

We will use cross_val_predict with 5-fold cross-validation for our linear regression.

In [268]:
y_homeaudio = homeaudio_df.pop('gmv')
X_homeaudio = homeaudio_df
In [269]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

homeaudio_model_cv = LinearRegression().fit(X_homeaudio, y_homeaudio)
homeaudio_predictions_cv = cross_val_predict(homeaudio_model_cv, X_homeaudio, y_homeaudio, cv=5)
accuracy = metrics.r2_score(y_homeaudio, homeaudio_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio, homeaudio_predictions_cv)))
Cross-Predicted Accuracy: 0.7260604462246434
Mean Squared Error: 0.2739395537753566
With simple linear regression, using 5-fold cross-validation, we get an R2 score of 0.73 and an MSE of 0.27.

Determining Feature Importance for homeaudio from model without cv

In [270]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


homeaudio_lr_model_parameters = list(homeaudio_model.coef_)
homeaudio_lr_model_parameters.insert(0, homeaudio_model.intercept_)
homeaudio_lr_model_parameters = [round(x, 3) for x in homeaudio_lr_model_parameters]
cols = X_homeaudio_test.columns
cols = cols.insert(0, "constant")
homeaudio_lr_coef = list(zip(cols, homeaudio_lr_model_parameters))
homeaudio_lr_coef
Out[270]:
[('constant', -0.009),
 ('Discount%', 0.015),
 ('deliverybdays', 0.016),
 ('deliverycdays', -0.001),
 ('sla', -0.121),
 ('product_procurement_sla', -0.057),
 ('is_cod', 0.131),
 ('is_mass_market', 0.148),
 ('product_vertical_djcontroller', 0.016),
 ('product_vertical_dock', 0.111),
 ('product_vertical_dockingstation', 0.019),
 ('product_vertical_fmradio', 0.133),
 ('product_vertical_hifisystem', 0.05),
 ('product_vertical_homeaudiospeaker', 0.392),
 ('product_vertical_karaokeplayer', -0.0),
 ('product_vertical_slingbox', 0.054),
 ('product_vertical_soundmixer', 0.028),
 ('product_vertical_voicerecorder', -0.069),
 ('payday_week', 0.056),
 ('holiday_week', -0.049),
 ('Total Investment', -0.05),
 ('Total Investment_SMA_3', 0.019),
 ('Total Investment_SMA_5', -0.029),
 ('Total Investment_EMA_8', 0.073),
 ('Total_Investment_Ad_Stock', -0.064),
 ('TV', -0.037),
 ('TV_SMA_3', 0.025),
 ('TV_SMA_5', 0.026),
 ('TV_EMA_8', -0.012),
 ('TV_Ad_Stock', -0.022),
 ('Digital', 0.009),
 ('Digital_SMA_3', 0.139),
 ('Digital_SMA_5', -0.087),
 ('Digital_EMA_8', 0.057),
 ('Digital_Ad_Stock', -0.007),
 ('Sponsorship', -0.076),
 ('Sponsorship_SMA_3', -0.015),
 ('Sponsorship_SMA_5', -0.04),
 ('Sponsorship_EMA_8', 0.056),
 ('Sponsorship_Ad_Stock', -0.093),
 ('Content Marketing', -0.001),
 ('Content Marketing_SMA_3', 0.056),
 ('Content Marketing_SMA_5', -0.136),
 ('Content Marketing_EMA_8', 0.071),
 ('Content_Marketing_Ad_Stock', -0.052),
 ('Online marketing', -0.036),
 ('Online marketing_SMA_3', -0.034),
 ('Online marketing_SMA_5', 0.011),
 ('Online marketing_EMA_8', 0.089),
 ('Online_marketing_Ad_Stock', -0.046),
 ('Affiliates', 0.008),
 ('Affiliates_SMA_3', 0.016),
 ('Affiliates_SMA_5', 0.05),
 ('Affiliates_EMA_8', 0.103),
 ('Affiliates_Ad_Stock', 0.003),
 ('SEM', 0.014),
 ('SEM_SMA_3', 0.094),
 ('SEM_SMA_5', -0.087),
 ('SEM_EMA_8', 0.091),
 ('SEM_Ad_Stock', -0.025),
 ('Radio', 0.0),
 ('Radio_SMA_3', -0.06),
 ('Radio_SMA_5', 0.012),
 ('Radio_EMA_8', -0.013),
 ('Radio_Ad_Stock', -0.015),
 ('Other', -0.008),
 ('Other_SMA_3', 0.021),
 ('Other_SMA_5', 0.06),
 ('Other_EMA_8', -0.035),
 ('Other_Ad_Stock', 0.029),
 ('NPS', 0.048),
 ('NPS_SMA_3', -0.002),
 ('NPS_SMA_5', 0.024),
 ('Stock Index', -0.133),
 ('Stock Index_SMA_3', -0.048),
 ('Stock Index_SMA_5', 0.038),
 ('Max Temp', -0.06),
 ('Min Temp', 0.016),
 ('Mean Temp', 0.11),
 ('Heat Deg Days', -0.14),
 ('Cool Deg Days', -0.018),
 ('Total Rain (mm)', -0.014),
 ('Total Snow (cm)', 0.048),
 ('Total Precip (mm)', -0.003),
 ('Snow on Grnd (cm)', 0.018),
 ('Sale', -0.008)]
In [271]:
homeaudio_lr_coef_df = pd.DataFrame(homeaudio_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
homeaudio_lr_coef_df = homeaudio_lr_coef_df.rename(columns=col_rename)
homeaudio_lr_coef_df = homeaudio_lr_coef_df.iloc[1:,:]
homeaudio_lr_coef_df = homeaudio_lr_coef_df.loc[homeaudio_lr_coef_df['Coefficients']!=0.0]
homeaudio_lr_coef_df = homeaudio_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
homeaudio_lr_coef_df
Out[271]:
Features Coefficients
13 product_vertical_homeaudiospeaker 0.392
7 is_mass_market 0.148
31 Digital_SMA_3 0.139
11 product_vertical_fmradio 0.133
6 is_cod 0.131
9 product_vertical_dock 0.111
78 Mean Temp 0.110
53 Affiliates_EMA_8 0.103
56 SEM_SMA_3 0.094
58 SEM_EMA_8 0.091
48 Online marketing_EMA_8 0.089
23 Total Investment_EMA_8 0.073
43 Content Marketing_EMA_8 0.071
67 Other_SMA_5 0.060
33 Digital_EMA_8 0.057
41 Content Marketing_SMA_3 0.056
38 Sponsorship_EMA_8 0.056
18 payday_week 0.056
15 product_vertical_slingbox 0.054
52 Affiliates_SMA_5 0.050
12 product_vertical_hifisystem 0.050
70 NPS 0.048
82 Total Snow (cm) 0.048
75 Stock Index_SMA_5 0.038
69 Other_Ad_Stock 0.029
16 product_vertical_soundmixer 0.028
27 TV_SMA_5 0.026
26 TV_SMA_3 0.025
72 NPS_SMA_5 0.024
66 Other_SMA_3 0.021
10 product_vertical_dockingstation 0.019
21 Total Investment_SMA_3 0.019
84 Snow on Grnd (cm) 0.018
8 product_vertical_djcontroller 0.016
77 Min Temp 0.016
51 Affiliates_SMA_3 0.016
2 deliverybdays 0.016
1 Discount% 0.015
55 SEM 0.014
62 Radio_SMA_5 0.012
47 Online marketing_SMA_5 0.011
30 Digital 0.009
50 Affiliates 0.008
54 Affiliates_Ad_Stock 0.003
40 Content Marketing -0.001
3 deliverycdays -0.001
71 NPS_SMA_3 -0.002
83 Total Precip (mm) -0.003
34 Digital_Ad_Stock -0.007
65 Other -0.008
85 Sale -0.008
28 TV_EMA_8 -0.012
63 Radio_EMA_8 -0.013
81 Total Rain (mm) -0.014
64 Radio_Ad_Stock -0.015
36 Sponsorship_SMA_3 -0.015
80 Cool Deg Days -0.018
29 TV_Ad_Stock -0.022
59 SEM_Ad_Stock -0.025
22 Total Investment_SMA_5 -0.029
46 Online marketing_SMA_3 -0.034
68 Other_EMA_8 -0.035
45 Online marketing -0.036
25 TV -0.037
37 Sponsorship_SMA_5 -0.040
49 Online_marketing_Ad_Stock -0.046
74 Stock Index_SMA_3 -0.048
19 holiday_week -0.049
20 Total Investment -0.050
44 Content_Marketing_Ad_Stock -0.052
5 product_procurement_sla -0.057
61 Radio_SMA_3 -0.060
76 Max Temp -0.060
24 Total_Investment_Ad_Stock -0.064
17 product_vertical_voicerecorder -0.069
35 Sponsorship -0.076
57 SEM_SMA_5 -0.087
32 Digital_SMA_5 -0.087
39 Sponsorship_Ad_Stock -0.093
4 sla -0.121
73 Stock Index -0.133
42 Content Marketing_SMA_5 -0.136
79 Heat Deg Days -0.140

Plotting the Features in descending order of Importance for homeaudio

In [272]:
# Use a tall figure so that all feature labels fit on the y-axis.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=homeaudio_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplot fits into the figure area.
plt.tight_layout()

# display the plot
plt.show()
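The 5 features with the largest positive coefficients for GMV (revenue) in the homeaudio model are listed below; negative coefficients such as Heat Deg Days (-0.140) and Content Marketing_SMA_5 (-0.136) are of comparable magnitude.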
Features Coefficients
product_vertical_homeaudiospeaker 0.392
is_mass_market 0.148
Digital_SMA_3 0.139
product_vertical_fmradio 0.133
is_cod 0.131
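Because the features are on a common scale, importance can also be ranked by the absolute size of the coefficient rather than by its signed value. The sketch below does this for a handful of homeaudio coefficients copied from Out[271]; the variable names are hypothetical and only a subset of the features is included.

```python
import pandas as pd

# A subset of homeaudio coefficients copied from the output above
coefs = pd.Series({
    'product_vertical_homeaudiospeaker': 0.392,
    'is_mass_market': 0.148,
    'Digital_SMA_3': 0.139,
    'product_vertical_fmradio': 0.133,
    'is_cod': 0.131,
    'Heat Deg Days': -0.140,
    'Content Marketing_SMA_5': -0.136,
    'Stock Index': -0.133,
    'sla': -0.121,
}, name='Coefficients')

# Order by absolute value while keeping the original sign in the result
order = coefs.abs().sort_values(ascending=False).index
top5_by_magnitude = coefs.reindex(order).head(5)
print(top5_by_magnitude)
```

Ranked this way, Heat Deg Days and Content Marketing_SMA_5 enter the top five, which is worth keeping in mind when reading the list above.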

Multiplicative Model

The linear regression model that we built earlier is an additive model: it implicitly assumes that the different KPIs affect the revenue additively.

Y = α + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + ϵ

However, when there are interactions between the KPIs, we go for a multiplicative model.

To fit a multiplicative model, take logarithms of the data (on both sides of the model), then analyse the log data as before.

Y = e^α · X1^β1 · X2^β2 · X3^β3 · X4^β4 · X5^β5 · e^ϵ'

lnY = α + β1ln(X1) + β2ln(X2) + β3ln(X3) + β4ln(X4) + β5ln(X5) + ϵ'
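The log-log fit can be sketched as follows on synthetic data. The parameter values, the two-KPI setup and the names are assumptions for illustration and do not come from the notebook. The data is generated from a multiplicative model with a multiplicative error term, and an ordinary linear regression on the logged variables recovers the exponents.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 200
alpha, beta1, beta2 = 1.0, 0.6, 0.3          # hypothetical true parameters

# Strictly positive KPIs, so that the logarithm is defined
X = rng.uniform(1.0, 10.0, size=(n, 2))
eps = rng.normal(scale=0.05, size=n)

# Multiplicative model: Y = e^alpha * X1^beta1 * X2^beta2 * e^eps
y = np.exp(alpha) * X[:, 0] ** beta1 * X[:, 1] ** beta2 * np.exp(eps)

# Taking logs on both sides turns it into a linear model in ln(X)
log_model = LinearRegression().fit(np.log(X), np.log(y))
print('Estimated alpha:', log_model.intercept_)
print('Estimated betas:', log_model.coef_)
```

In a log-log model each coefficient reads as an elasticity: a 1% change in a KPI is associated with roughly a β% change in Y. Real marketing data often contains zeros (the frames shown here do), for which ln is undefined; one common workaround, stated here as an assumption rather than the notebook's method, is to use np.log1p or to add a small constant before taking logs.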

In [273]:
homeaudio_org_df.head()
Out[273]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [274]:
# Making copy of dataframes from the original ones
cameraaccessory_mul_df = cameraaccessory_org_df.copy()
gamingaccessory_mul_df = gamingaccessory_org_df.copy()
homeaudio_mul_df = homeaudio_org_df.copy()
homeaudio_mul_df.head()
Out[274]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [275]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_mul_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_mul_df.isnull().sum()/homeaudio_mul_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[275]:
Total Percentage
Sale 0 0.000
Digital 0 0.000
Total Investment_SMA_5 0 0.000
Total Investment_EMA_8 0 0.000
Total_Investment_Ad_Stock 0 0.000
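
The null check above is repeated after each transformation step, so it could be wrapped in a small helper. The sketch below is one way to do that; the function name `null_summary` and the toy dataframe are hypothetical and are not part of the original notebook.

```python
import numpy as np
import pandas as pd


def null_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return the null count and null percentage per column, largest count first."""
    total = df.isnull().sum()
    percentage = (100 * total / len(df)).round(2)
    summary = pd.DataFrame({'Total': total, 'Percentage': percentage})
    return summary.sort_values('Total', ascending=False)


# Toy frame standing in for homeaudio_mul_df (hypothetical values)
toy_df = pd.DataFrame({'gmv': [1.0, np.nan, 3.0, 4.0],
                       'TV': [0.1, 0.2, 0.3, 0.4]})
summary = null_summary(toy_df)
print(summary.head())
```

With such a helper, each check would reduce to a single call such as `null_summary(homeaudio_mul_df).head()`.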

We will drop the Week column because it is only a row identifier and will not help in predicting revenue.

In [276]:
# removing columns
cameraaccessory_mul_df = cameraaccessory_mul_df.drop('Week', axis=1)
gamingaccessory_mul_df = gamingaccessory_mul_df.drop('Week', axis=1)
homeaudio_mul_df = homeaudio_mul_df.drop('Week', axis=1)
homeaudio_mul_df.head()
Out[276]:
gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [277]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_mul_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_mul_df.isnull().sum()/homeaudio_mul_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[277]:
Total Percentage
Sale 0 0.000
Digital_SMA_3 0 0.000
Total Investment_EMA_8 0 0.000
Total_Investment_Ad_Stock 0 0.000
TV 0 0.000

Taking the logarithm of both the dependent and the independent variables

After taking the log, every 0 value becomes -inf and every negative value becomes NaN. We replace both with 0.
In [278]:
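# Log-transform every column. log(0) gives -inf and log(negative) gives NaN,
# so both are reset to 0 afterwards. The same steps are repeated for each category.
cameraaccessory_mul_df = cameraaccessory_mul_df.applymap(lambda x: np.log(x))
cameraaccessory_mul_df = cameraaccessory_mul_df.replace([np.inf, -np.inf], 0)
cameraaccessory_mul_df = cameraaccessory_mul_df.replace(np.nan, 0)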

gamingaccessory_mul_df = gamingaccessory_mul_df.applymap(lambda x: np.log(x))
gamingaccessory_mul_df = gamingaccessory_mul_df.replace([np.inf, -np.inf], 0)
gamingaccessory_mul_df = gamingaccessory_mul_df.replace(np.nan, 0)

homeaudio_mul_df = homeaudio_mul_df.applymap(lambda x: np.log(x))
homeaudio_mul_df = homeaudio_mul_df.replace([np.inf, -np.inf], 0)
homeaudio_mul_df = homeaudio_mul_df.replace(np.nan, 0)


homeaudio_mul_df.head()
Out[278]:
gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 15.336 3.448 0.000 0.000 1.997 1.052 7.367 7.220 2.079 3.497 0.000 6.246 3.135 7.225 0.000 0.000 0.000 4.143 0.000 0.000 1.450 0.000 0.000 1.450 1.450 -2.919 0.000 0.000 -2.919 -2.919 -0.457 0.000 0.000 -0.457 -0.457 0.617 0.000 0.000 0.617 0.617 0.000 0.000 0.000 0.000 0.000 -1.103 0.000 0.000 -1.103 -1.103 -1.988 0.000 0.000 -1.988 -1.988 0.228 0.000 0.000 0.228 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 0.000 0.000 7.071 0.000 0.000 3.332 2.526 3.001 -1.261 0.869 1.485 0.000 1.485 0.000 0.000
26 15.497 3.495 0.000 0.000 1.944 1.010 7.533 7.384 1.946 3.912 0.000 6.353 3.738 7.392 0.000 0.000 0.000 4.234 0.000 0.000 1.450 0.000 0.000 1.450 1.920 -2.919 0.000 0.000 -2.919 -2.449 -0.457 0.000 0.000 -0.457 0.013 0.617 0.000 0.000 0.617 1.087 0.000 0.000 0.000 0.000 0.000 -1.103 0.000 0.000 -1.103 -0.633 -1.988 0.000 0.000 -1.988 -1.518 0.228 0.000 0.000 0.228 0.698 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 0.000 0.000 7.071 0.000 0.000 3.497 2.398 3.143 0.000 1.645 0.336 0.000 0.336 0.000 0.693
27 15.359 3.477 0.000 0.000 1.956 1.051 7.472 7.358 1.386 4.025 0.000 6.358 3.584 7.265 0.000 0.000 0.000 3.829 0.000 0.000 1.450 1.450 0.000 1.450 2.123 -2.919 -2.919 0.000 -2.919 -2.246 -0.457 -0.457 0.000 -0.457 0.216 0.617 0.617 0.000 0.617 1.290 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 0.000 -1.103 -0.430 -1.988 -1.988 0.000 -1.988 -1.315 0.228 0.228 0.000 0.228 0.901 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 4.000 0.000 7.071 7.071 0.000 3.450 2.674 3.138 0.000 1.621 0.077 0.000 0.077 0.000 0.000
28 15.054 3.472 0.000 0.000 1.974 1.006 7.126 6.977 0.693 3.761 0.000 6.040 2.996 6.932 0.000 0.000 0.000 3.784 0.000 0.000 1.450 1.450 0.000 1.450 2.228 -2.919 -2.919 0.000 -2.919 -2.141 -0.457 -0.457 0.000 -0.457 0.320 0.617 0.617 0.000 0.617 1.395 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 0.000 -1.103 -0.325 -1.988 -1.988 0.000 -1.988 -1.210 0.228 0.228 0.000 0.228 1.005 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 4.000 0.000 7.071 7.071 0.000 3.512 2.773 3.201 0.000 1.882 1.533 0.000 1.533 0.000 0.000
29 7.863 2.781 0.000 0.000 2.197 0.693 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.013 1.157 1.285 1.265 1.884 -6.908 -3.315 -3.137 -3.165 -2.638 -1.363 -0.679 -0.584 -0.599 0.079 -1.546 0.268 0.423 0.398 0.968 0.000 0.000 0.000 0.000 0.000 -3.650 -1.470 -1.306 -1.332 -0.778 -4.200 -2.340 -2.184 -2.208 -1.641 -0.687 0.005 0.100 0.085 0.762 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.094 4.032 4.020 7.095 7.079 7.076 3.350 2.708 3.075 0.000 1.295 -1.050 0.000 -1.050 0.000 0.000
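
The effect of the log transform on zeros and negative values can be checked on a toy dataframe. This is a minimal sketch with made-up numbers. The `np.log1p` line at the end is an alternative shown only for comparison and is not what the notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical values: a spend column containing 0 and a temperature column containing a negative
toy_df = pd.DataFrame({'spend': [0.0, 1.0, 4.265],
                       'temp': [-2.0, 0.5, 28.0]})

# Silence the divide-by-zero / invalid-value warnings that np.log raises here
with np.errstate(divide='ignore', invalid='ignore'):
    logged = np.log(toy_df)

# log(0) is -inf and log(negative) is NaN; both are reset to 0, as in the notebook
cleaned = logged.replace([np.inf, -np.inf], 0).fillna(0)
print(cleaned)

# Side effect: a raw value of 0 and a raw value of 1 both end up as 0 after cleaning.
# Alternative (assumption, not used in the notebook): log1p keeps them distinct.
log1p_spend = np.log1p(toy_df['spend'])
print(log1p_spend)
```

Note that after the replacement, a week with zero spend is indistinguishable from a week with a spend of exactly 1 unit, which is worth keeping in mind when reading the coefficients.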
In [279]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_mul_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_mul_df.isnull().sum()/homeaudio_mul_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[279]:
Total Percentage
Sale 0 0.000
Digital_SMA_3 0 0.000
Total Investment_EMA_8 0 0.000
Total_Investment_Ad_Stock 0 0.000
TV 0 0.000

Rescaling the Features of the 3 Dataframes

We will use standard scaling (zero mean, unit variance) on every column, including the target gmv.

In [280]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

cameraaccessory_mul_df[cameraaccessory_mul_df.columns]=scaler.fit_transform(cameraaccessory_mul_df[cameraaccessory_mul_df.columns])
gamingaccessory_mul_df[gamingaccessory_mul_df.columns]=scaler.fit_transform(gamingaccessory_mul_df[gamingaccessory_mul_df.columns])
homeaudio_mul_df[homeaudio_mul_df.columns]=scaler.fit_transform(homeaudio_mul_df[homeaudio_mul_df.columns])

homeaudio_mul_df.head()
Out[280]:
gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 0.137 -0.777 0.390 0.354 1.799 1.220 0.200 0.103 1.325 0.174 -1.879 0.198 -0.104 0.094 0.000 -0.146 -0.588 0.047 0.000 0.000 -1.455 -3.028 -2.631 -2.055 -2.806 -1.388 0.352 0.279 -2.223 -2.497 0.603 1.095 1.052 0.404 -0.799 -0.892 -1.722 -1.718 -1.778 -2.291 1.425 1.520 1.735 1.889 1.143 -1.727 -1.088 -1.215 -2.260 -2.713 -1.752 -0.043 -0.133 -2.218 -2.757 -0.188 -0.651 -0.678 -0.503 -1.881 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 1.560 -4.780 -3.311 0.186 -4.791 -3.316 0.679 1.216 0.771 -2.166 0.608 0.841 -0.226 0.751 -0.309 -0.443
26 0.279 -0.465 0.390 0.354 1.428 0.555 0.323 0.251 1.130 0.585 -1.879 0.310 0.794 0.242 0.000 -0.146 -0.588 0.177 0.000 0.000 -1.455 -3.028 -2.631 -2.055 -2.175 -1.388 0.352 0.279 -2.223 -2.139 0.603 1.095 1.052 0.404 -0.107 -0.892 -1.722 -1.718 -1.778 -1.752 1.425 1.520 1.735 1.889 1.143 -1.727 -1.088 -1.215 -2.260 -2.296 -1.752 -0.043 -0.133 -2.218 -2.298 -0.188 -0.651 -0.678 -0.503 -1.059 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 1.560 -4.780 -3.311 0.186 -4.791 -3.316 0.916 1.102 0.887 -1.163 1.556 -0.261 -0.226 -0.326 -0.309 0.999
27 0.157 -0.588 0.390 0.354 1.514 1.205 0.278 0.228 0.314 0.698 -1.879 0.315 0.564 0.129 0.000 -0.146 -0.588 -0.402 0.000 0.000 -1.455 -1.326 -2.631 -2.055 -1.903 -1.388 -1.892 0.279 -2.223 -1.984 0.603 0.502 1.052 0.404 0.192 -0.892 -1.079 -1.718 -1.778 -1.520 1.425 1.520 1.735 1.889 1.143 -1.727 -2.169 -1.215 -2.260 -2.116 -1.752 -2.280 -0.133 -2.218 -2.100 -0.188 -0.262 -0.678 -0.503 -0.704 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 1.560 0.350 -3.311 0.186 0.218 -3.316 0.849 1.348 0.882 -1.163 1.526 -0.510 -0.226 -0.569 -0.309 -0.443
28 -0.112 -0.619 0.390 0.354 1.639 0.488 0.022 -0.116 -0.698 0.436 -1.879 -0.017 -0.313 -0.168 0.000 -0.146 -0.588 -0.466 0.000 0.000 -1.455 -1.326 -2.631 -2.055 -1.763 -1.388 -1.892 0.279 -2.223 -1.905 0.603 0.502 1.052 0.404 0.346 -0.892 -1.079 -1.718 -1.778 -1.400 1.425 1.520 1.735 1.889 1.143 -1.727 -2.169 -1.215 -2.260 -2.023 -1.752 -2.280 -0.133 -2.218 -1.998 -0.188 -0.262 -0.678 -0.503 -0.522 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 1.560 0.350 -3.311 0.186 0.218 -3.316 0.937 1.435 0.934 -1.163 1.844 0.887 -0.226 0.796 -0.309 -0.443
29 -6.479 -5.201 0.390 0.354 3.181 -4.506 -5.245 -6.421 -1.709 -3.284 -1.879 -6.340 -4.779 -6.360 0.000 -0.146 -0.588 -5.874 0.000 0.000 -3.264 -1.670 -1.291 -2.368 -2.224 -3.850 -2.197 -2.380 -2.447 -2.283 -0.502 0.215 0.271 0.163 -0.009 -2.890 -1.443 -1.274 -2.111 -1.889 1.425 1.520 1.735 1.889 1.143 -3.791 -2.529 -2.602 -2.499 -2.425 -3.772 -2.676 -2.897 -2.471 -2.418 -1.660 -0.642 -0.502 -0.803 -0.947 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 2.912 0.391 0.423 0.530 0.223 0.311 0.705 1.378 0.831 -1.163 1.128 -1.591 -0.226 -1.625 -0.309 -0.443
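
To make the scaled values above easier to read, the sketch below reproduces what StandardScaler computes on a small toy frame: each column has its mean subtracted and is divided by its population standard deviation (ddof=0). The toy values are illustrative only. A constant column, such as the all-zero payday_week after the log step, is mapped to 0.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frame (illustrative values); payday_week is constant on purpose
toy_df = pd.DataFrame({'gmv': [15.336, 15.497, 15.359, 15.054],
                       'Discount%': [3.448, 3.495, 3.477, 3.472],
                       'payday_week': [0.0, 0.0, 0.0, 0.0]})

scaled = pd.DataFrame(StandardScaler().fit_transform(toy_df),
                      columns=toy_df.columns, index=toy_df.index)

# Manual z-score for the non-constant columns, using the population std (ddof=0)
varying = ['gmv', 'Discount%']
manual = (toy_df[varying] - toy_df[varying].mean()) / toy_df[varying].std(ddof=0)

print(scaled)
print(np.allclose(scaled[varying], manual))
```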

Splitting the 3 Dataframes into Training and Testing Sets

Before fitting a regression model, we split each dataframe into a training set (70%) and a testing set (30%).

In [281]:
from sklearn.model_selection import train_test_split

# Fix random_state so that the train and test sets contain the same rows on every run

cameraaccessory_mul_train, cameraaccessory_mul_test = train_test_split(cameraaccessory_mul_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)

gamingaccessory_mul_train, gamingaccessory_mul_test = train_test_split(gamingaccessory_mul_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)

homeaudio_mul_train, homeaudio_mul_test = train_test_split(homeaudio_mul_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)
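
In the cells above, the scaler is fitted on the full dataframes before the split, so statistics from the test rows influence the training features. A common alternative is to split first and fit the scaler on the training rows only. The sketch below illustrates that order on synthetic data. It is a variation under stated assumptions, not the notebook's method, and the column names and values are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for one of the *_mul_df frames (hypothetical data)
rng = np.random.default_rng(0)
toy_df = pd.DataFrame(rng.normal(size=(50, 3)), columns=['gmv', 'TV', 'Digital'])

# 1. Split first, with a fixed seed for reproducibility
train_df, test_df = train_test_split(toy_df, train_size=0.7, test_size=0.3,
                                     random_state=100)

# 2. Fit the scaler on the training features only
feature_cols = ['TV', 'Digital']
scaler = StandardScaler().fit(train_df[feature_cols])

# 3. Apply the same training statistics to both sets
X_train = pd.DataFrame(scaler.transform(train_df[feature_cols]),
                       columns=feature_cols, index=train_df.index)
X_test = pd.DataFrame(scaler.transform(test_df[feature_cols]),
                      columns=feature_cols, index=test_df.index)
y_train, y_test = train_df['gmv'], test_df['gmv']

print(X_train.shape, X_test.shape)
```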

Dividing the 3 dataframes into X and Y sets for the model building

In [282]:
y_cameraaccessory_mul_train = cameraaccessory_mul_train.pop('gmv')
X_cameraaccessory_mul_train = cameraaccessory_mul_train

y_gamingaccessory_mul_train = gamingaccessory_mul_train.pop('gmv')
X_gamingaccessory_mul_train = gamingaccessory_mul_train

y_homeaudio_mul_train = homeaudio_mul_train.pop('gmv')
X_homeaudio_mul_train = homeaudio_mul_train

X_homeaudio_mul_train.head()
Out[282]:
Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
12 -0.301 0.873 0.920 0.231 0.240 -1.952 -0.056 0.639 0.080 1.001 0.103 0.644 -0.162 0.000 6.856 0.450 0.736 0.000 0.000 0.060 0.566 0.633 0.472 0.468 0.575 0.878 0.915 0.879 0.861 -0.699 -0.022 0.024 -0.316 -0.163 0.205 0.563 0.652 0.318 0.440 -1.764 -0.633 -0.772 -0.800 -1.068 0.316 0.375 0.386 0.452 0.391 0.385 0.421 0.404 0.523 0.461 -0.469 -0.316 -0.288 -0.591 -0.392 0.526 0.320 0.006 -0.944 -0.735 -0.406 1.351 1.565 1.358 1.292 0.812 0.236 0.315 0.944 0.164 0.255 -0.603 -1.033 -1.473 1.077 -0.451 0.236 -2.093 0.357 -0.309 -0.443
10 -0.142 0.082 0.137 -0.030 -0.359 0.084 0.103 1.130 -0.227 -0.072 0.250 0.605 -0.025 0.000 -0.146 -0.588 0.492 0.000 0.000 0.771 0.749 0.432 0.481 0.617 0.934 1.000 0.678 0.836 0.929 0.375 0.259 0.037 -0.158 0.177 0.701 0.718 0.353 0.233 0.525 -0.183 -0.351 -0.475 -0.485 -0.497 0.405 0.410 0.349 0.468 0.416 0.466 0.453 0.337 0.527 0.475 -0.135 -0.206 -0.405 -0.542 -0.311 -2.063 0.352 -0.048 -0.751 -0.213 1.990 2.041 1.135 1.674 1.777 0.013 0.211 0.314 -1.901 0.113 0.272 0.230 -1.033 -0.244 0.830 -0.451 1.862 -0.226 1.749 -0.309 -0.443
32 -0.024 0.390 0.354 0.655 0.258 0.140 0.083 1.130 0.624 -0.323 0.230 0.388 -0.097 0.000 -0.146 -0.588 -0.434 0.000 0.000 0.723 0.252 -0.205 -0.507 0.185 0.394 0.017 -0.522 -0.417 0.062 -0.159 -0.418 -0.607 -0.583 -0.467 1.081 0.732 0.234 0.157 0.708 0.154 -0.278 -0.985 -1.255 -0.527 0.310 -0.100 -0.680 -0.561 -0.059 0.274 -0.232 -0.977 -0.667 -0.124 0.151 -0.337 -0.819 -0.885 -0.373 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 -0.615 0.269 0.407 -0.754 0.193 0.302 0.653 1.251 0.789 -1.163 0.690 -0.493 -0.226 -0.552 -0.309 -0.443
22 -1.150 0.850 0.894 -0.743 0.282 0.057 -0.083 -0.106 -0.541 0.813 0.101 -1.679 -0.164 0.000 -0.146 2.326 0.679 0.000 0.000 -0.296 -0.080 0.095 -0.071 -0.136 -0.128 -0.416 -0.659 -0.138 -0.282 -0.103 -0.519 -0.887 -1.103 -0.623 0.232 0.194 0.218 0.099 0.159 -1.628 -0.696 -0.515 -0.928 -1.107 -0.274 -0.042 0.156 0.223 0.044 -0.273 -0.143 -0.020 0.189 0.009 -0.665 -0.514 -0.355 -0.668 -0.532 0.526 0.267 -0.046 -1.228 -1.297 -0.406 -2.188 -0.935 -1.438 -0.804 0.444 0.222 0.296 0.389 0.235 0.324 0.916 0.872 0.818 -1.843 1.153 -0.493 -0.226 -0.552 -0.309 -0.443
45 0.638 0.390 0.354 0.289 0.937 0.328 0.046 0.639 0.389 0.588 0.087 0.341 0.256 0.000 -0.146 -0.588 -0.071 0.000 0.000 0.853 0.555 0.313 0.603 0.601 0.598 0.479 0.317 0.524 0.551 0.836 0.423 0.012 0.757 0.596 0.987 0.718 0.382 0.753 0.754 0.531 0.209 -0.273 0.194 0.342 0.567 0.501 0.406 0.521 0.520 0.552 0.470 0.358 0.523 0.519 1.099 0.703 0.295 0.942 0.805 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 -0.963 0.135 0.256 -1.585 0.172 0.298 -1.666 -1.033 -1.654 1.139 -0.451 0.618 -0.226 0.533 -0.309 -0.443

Dividing the 3 test dataframes into X and Y sets for model evaluation

In [283]:
y_cameraaccessory_mul_test = cameraaccessory_mul_test.pop('gmv')
X_cameraaccessory_mul_test = cameraaccessory_mul_test

y_gamingaccessory_mul_test = gamingaccessory_mul_test.pop('gmv')
X_gamingaccessory_mul_test = gamingaccessory_mul_test

y_homeaudio_mul_test = homeaudio_mul_test.pop('gmv')
X_homeaudio_mul_test = homeaudio_mul_test

X_homeaudio_mul_test.head()
Out[283]:
Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
31 0.070 0.390 0.354 -0.128 1.406 0.233 0.192 1.130 0.643 -0.646 0.372 0.522 -0.016 0.000 -0.146 -0.588 0.406 0.000 0.000 0.723 -0.490 -0.830 -1.254 -0.385 0.394 -0.515 -1.107 -0.912 -0.293 -0.159 -0.539 -0.686 -0.563 -0.535 1.081 0.031 -0.460 -0.601 0.191 0.154 -0.824 -1.660 -1.900 -1.085 0.310 -0.770 -1.400 -1.109 -0.467 0.274 -0.993 -1.818 -1.247 -0.566 0.151 -0.923 -1.270 -1.195 -0.767 0.526 0.471 0.332 0.988 0.747 -0.406 -0.313 -0.167 -0.421 -0.629 -0.615 0.374 0.451 -0.754 0.214 0.312 0.894 0.923 0.736 -0.967 0.676 -0.623 -0.226 -0.679 -0.309 -0.443
5 1.152 -1.640 -1.664 -0.743 -0.264 0.549 0.410 -1.709 0.624 -0.323 0.212 -0.240 0.592 0.000 -0.146 -0.588 0.475 0.000 0.000 -0.433 -0.373 0.059 0.019 -0.219 0.006 -0.156 0.033 0.238 0.091 -0.010 -0.150 -0.672 -0.506 -0.357 -0.675 -0.834 -1.084 -0.794 -0.947 -0.016 -0.157 -0.041 -0.088 -0.082 0.286 0.266 0.424 0.508 0.385 0.298 0.246 0.391 0.537 0.408 -0.595 -0.694 -0.669 -0.463 -0.616 0.526 -2.424 0.087 -0.599 -0.641 -0.406 -0.313 1.321 1.071 0.706 0.390 0.245 0.305 0.715 0.244 0.298 0.002 -1.033 -0.327 0.865 -0.451 -0.584 -0.226 -0.641 0.407 1.843
9 0.149 -0.525 -0.529 -0.399 -0.515 0.308 0.165 -1.709 0.080 0.911 0.194 -0.104 0.215 0.000 -0.146 1.057 0.475 0.000 0.000 0.771 0.479 0.246 0.327 0.481 0.934 0.770 0.461 0.697 0.822 0.375 0.137 -0.046 -0.235 0.107 0.701 0.407 0.086 -0.000 0.361 -0.183 -0.281 -0.427 -0.415 -0.440 0.405 0.364 0.318 0.462 0.401 0.466 0.388 0.291 0.513 0.454 -0.135 -0.354 -0.504 -0.574 -0.374 -2.063 0.320 -0.124 -0.828 -0.392 1.990 1.351 0.529 1.358 1.580 0.013 0.223 0.319 -1.901 0.159 0.291 -0.127 -1.033 0.113 0.592 -0.451 -1.249 -0.226 -1.291 -0.309 -0.443
3 1.221 -2.009 -2.124 -0.995 0.915 0.608 0.384 0.314 0.012 -0.323 0.259 -0.240 0.640 0.000 -0.146 -0.588 0.458 0.000 0.000 -0.433 0.193 0.310 0.358 0.153 0.006 0.273 0.261 0.430 0.324 -0.010 -0.969 -1.334 -0.545 -0.709 -0.675 -1.312 -1.435 -0.492 -0.904 -0.016 0.210 0.186 0.116 0.222 0.286 0.517 0.572 0.607 0.509 0.298 0.527 0.562 0.638 0.539 -0.595 -0.609 -0.617 -0.160 -0.444 0.526 0.408 0.217 -0.101 0.594 -0.406 2.253 2.357 2.137 1.884 0.390 0.189 0.280 0.715 0.175 0.267 -0.426 -1.033 -0.714 0.979 -0.451 0.590 -0.226 0.506 -0.309 0.999
18 -0.253 0.839 0.882 -0.785 0.127 0.060 -0.061 1.130 -3.284 1.161 0.015 -0.844 -0.156 0.000 -0.146 1.488 0.831 0.000 0.000 0.178 0.196 0.197 0.178 0.161 -0.386 -0.644 -0.042 0.264 -0.078 -1.123 -1.331 -1.243 -1.331 -1.336 0.243 0.202 0.205 0.130 0.173 0.176 0.067 -0.523 -0.626 -0.206 0.426 0.435 0.381 0.469 0.437 0.344 0.303 0.280 0.457 0.387 -0.034 -0.099 -0.286 -0.495 -0.196 -2.048 0.353 -0.046 -0.904 -0.232 -0.411 -0.318 -0.935 -0.338 0.281 -0.498 0.166 0.306 0.784 0.248 0.332 0.265 0.201 0.381 0.206 -0.451 -1.739 -0.226 -1.770 -0.309 -0.443

Building Linear Regression model for cameraaccessory

In [284]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

cameraaccessory_mul_model = LinearRegression().fit(X_cameraaccessory_mul_train, y_cameraaccessory_mul_train)
y_cameraaccessory_mul_test_pred = cameraaccessory_mul_model.predict(X_cameraaccessory_mul_test)

print('R2 Score: {}'.format(r2_score(y_cameraaccessory_mul_test, y_cameraaccessory_mul_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_mul_test, y_cameraaccessory_mul_test_pred)))
R2 Score: 0.8395078516371086
Mean Squared Error: 0.35593405479143736
With plain linear regression on all features, we get an R2 score of 0.84 and an MSE of 0.36 on the test set.

Building Linear Regression model for cameraaccessory using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross-validation for our linear regression.

In [285]:
y_cameraaccessory_mul = cameraaccessory_mul_df.pop('gmv')
X_cameraaccessory_mul = cameraaccessory_mul_df
In [286]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

cameraaccessory_mul_model_cv = LinearRegression().fit(X_cameraaccessory_mul, y_cameraaccessory_mul)
cameraaccessory_mul_predictions_cv = cross_val_predict(cameraaccessory_mul_model_cv, X_cameraaccessory_mul, \
                                                       y_cameraaccessory_mul, cv=10)
accuracy = metrics.r2_score(y_cameraaccessory_mul, cameraaccessory_mul_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_mul, cameraaccessory_mul_predictions_cv)))
Cross-Predicted Accuracy: 0.90863128038612
Mean Squared Error: 0.09136871961388007
With linear regression under 10-fold cross-validation, we get an R2 score of 0.91 and an MSE of 0.09. Because gmv was standardised to unit variance over the full dataset, the MSE here equals 1 - R2.
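
The two cross-validated numbers above sum to 1, which follows from the target having been standardised over the full dataset. The sketch below checks this on synthetic data; the data and coefficients are made up. It also relies on the fact that cross_val_predict clones the estimator and refits it on each fold, so an unfitted LinearRegression can be passed directly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for roughly one year of weekly rows (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(52, 4))
y = X @ np.array([1.0, -0.5, 0.3, 0.0]) + rng.normal(scale=0.3, size=52)

# Standardise the target the way StandardScaler does (population std, ddof=0)
y = (y - y.mean()) / y.std()

# Out-of-fold predictions from 10-fold cross-validation
preds = cross_val_predict(LinearRegression(), X, y, cv=10)

r2 = r2_score(y, preds)
mse = mean_squared_error(y, preds)

# With a unit-variance target: R2 = 1 - SS_res / SS_tot = 1 - (n * MSE) / n = 1 - MSE
print(r2, mse, np.isclose(mse, 1 - r2))
```

Because of this identity, the MSE adds no information beyond R2 in this setting. Reporting the error on the original gmv scale would require inverting the scaling and the log transform.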

Determining Feature Importance for cameraaccessory from model with cv

In [287]:
# Linear regression model parameters
# Limit float output to 3 decimal places
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


cameraaccessory_mul_lr_model_parameters = list(cameraaccessory_mul_model_cv.coef_)
cameraaccessory_mul_lr_model_parameters.insert(0, cameraaccessory_mul_model_cv.intercept_)
cameraaccessory_mul_lr_model_parameters = [round(x, 3) for x in cameraaccessory_mul_lr_model_parameters]
# Take the feature names from the frame the model was actually fitted on
cols = X_cameraaccessory_mul.columns
cols = cols.insert(0, "constant")
cameraaccessory_mul_lr_coef = list(zip(cols, cameraaccessory_mul_lr_model_parameters))
cameraaccessory_mul_lr_coef
Out[287]:
[('constant', 0.0),
 ('Discount%', -0.107),
 ('deliverybdays', -0.044),
 ('deliverycdays', 0.059),
 ('sla', -0.086),
 ('product_procurement_sla', 0.042),
 ('is_cod', -0.025),
 ('is_mass_market', 0.149),
 ('product_vertical_cameraaccessory', 0.07),
 ('product_vertical_camerabag', 0.051),
 ('product_vertical_camerabattery', 0.16),
 ('product_vertical_camerabatterycharger', 0.121),
 ('product_vertical_camerabatterygrip', -0.039),
 ('product_vertical_cameraeyecup', 0.057),
 ('product_vertical_camerafilmrolls', -0.059),
 ('product_vertical_camerahousing', -0.006),
 ('product_vertical_cameraledlight', 0.0),
 ('product_vertical_cameramicrophone', -0.066),
 ('product_vertical_cameramount', 0.055),
 ('product_vertical_cameraremotecontrol', 0.071),
 ('product_vertical_cameratripod', 0.101),
 ('product_vertical_extensiontube', 0.03),
 ('product_vertical_filter', 0.032),
 ('product_vertical_flash', -0.042),
 ('product_vertical_flashshoeadapter', -0.0),
 ('product_vertical_lens', 0.181),
 ('product_vertical_reflectorumbrella', 0.006),
 ('product_vertical_softbox', 0.007),
 ('product_vertical_strap', 0.012),
 ('product_vertical_teleconverter', 0.0),
 ('product_vertical_telescope', -0.011),
 ('payday_week', -0.0),
 ('holiday_week', 0.0),
 ('Total Investment', 0.025),
 ('Total Investment_SMA_3', -0.003),
 ('Total Investment_SMA_5', 0.001),
 ('Total Investment_EMA_8', 0.036),
 ('Total_Investment_Ad_Stock', -0.018),
 ('TV', 0.105),
 ('TV_SMA_3', -0.163),
 ('TV_SMA_5', 0.063),
 ('TV_EMA_8', 0.09),
 ('TV_Ad_Stock', 0.003),
 ('Digital', -0.006),
 ('Digital_SMA_3', -0.032),
 ('Digital_SMA_5', 0.027),
 ('Digital_EMA_8', 0.006),
 ('Digital_Ad_Stock', -0.026),
 ('Sponsorship', 0.009),
 ('Sponsorship_SMA_3', -0.005),
 ('Sponsorship_SMA_5', -0.022),
 ('Sponsorship_EMA_8', 0.013),
 ('Sponsorship_Ad_Stock', -0.021),
 ('Content Marketing', -0.119),
 ('Content Marketing_SMA_3', 0.073),
 ('Content Marketing_SMA_5', -0.002),
 ('Content Marketing_EMA_8', 0.024),
 ('Content_Marketing_Ad_Stock', 0.096),
 ('Online marketing', 0.074),
 ('Online marketing_SMA_3', -0.027),
 ('Online marketing_SMA_5', -0.059),
 ('Online marketing_EMA_8', 0.04),
 ('Online_marketing_Ad_Stock', -0.018),
 ('Affiliates', 0.083),
 ('Affiliates_SMA_3', -0.023),
 ('Affiliates_SMA_5', -0.032),
 ('Affiliates_EMA_8', 0.048),
 ('Affiliates_Ad_Stock', -0.014),
 ('SEM', 0.008),
 ('SEM_SMA_3', 0.037),
 ('SEM_SMA_5', -0.051),
 ('SEM_EMA_8', -0.019),
 ('SEM_Ad_Stock', -0.028),
 ('Radio', 0.039),
 ('Radio_SMA_3', -0.003),
 ('Radio_SMA_5', -0.005),
 ('Radio_EMA_8', 0.021),
 ('Radio_Ad_Stock', -0.0),
 ('Other', -0.015),
 ('Other_SMA_3', 0.063),
 ('Other_SMA_5', -0.126),
 ('Other_EMA_8', 0.036),
 ('Other_Ad_Stock', 0.011),
 ('NPS', -0.031),
 ('NPS_SMA_3', -0.031),
 ('NPS_SMA_5', 0.011),
 ('Stock Index', 0.001),
 ('Stock Index_SMA_3', -0.034),
 ('Stock Index_SMA_5', 0.008),
 ('Max Temp', -0.023),
 ('Min Temp', -0.019),
 ('Mean Temp', 0.053),
 ('Heat Deg Days', -0.02),
 ('Cool Deg Days', 0.043),
 ('Total Rain (mm)', -0.061),
 ('Total Snow (cm)', -0.038),
 ('Total Precip (mm)', 0.105),
 ('Snow on Grnd (cm)', 0.006),
 ('Sale', -0.022)]
In [288]:
cameraaccessory_mul_lr_coef_df = pd.DataFrame(cameraaccessory_mul_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
cameraaccessory_mul_lr_coef_df = cameraaccessory_mul_lr_coef_df.rename(columns=col_rename)
cameraaccessory_mul_lr_coef_df = cameraaccessory_mul_lr_coef_df.iloc[1:,:]
cameraaccessory_mul_lr_coef_df = cameraaccessory_mul_lr_coef_df.loc[cameraaccessory_mul_lr_coef_df['Coefficients']!=0.0]
cameraaccessory_mul_lr_coef_df = cameraaccessory_mul_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
cameraaccessory_mul_lr_coef_df
Out[288]:
Features Coefficients
25 product_vertical_lens 0.181
10 product_vertical_camerabattery 0.160
7 is_mass_market 0.149
11 product_vertical_camerabatterycharger 0.121
38 TV 0.105
96 Total Precip (mm) 0.105
20 product_vertical_cameratripod 0.101
57 Content_Marketing_Ad_Stock 0.096
41 TV_EMA_8 0.090
63 Affiliates 0.083
58 Online marketing 0.074
54 Content Marketing_SMA_3 0.073
19 product_vertical_cameraremotecontrol 0.071
8 product_vertical_cameraaccessory 0.070
40 TV_SMA_5 0.063
79 Other_SMA_3 0.063
3 deliverycdays 0.059
13 product_vertical_cameraeyecup 0.057
18 product_vertical_cameramount 0.055
91 Mean Temp 0.053
9 product_vertical_camerabag 0.051
66 Affiliates_EMA_8 0.048
93 Cool Deg Days 0.043
5 product_procurement_sla 0.042
61 Online marketing_EMA_8 0.040
73 Radio 0.039
69 SEM_SMA_3 0.037
36 Total Investment_EMA_8 0.036
81 Other_EMA_8 0.036
22 product_vertical_filter 0.032
21 product_vertical_extensiontube 0.030
45 Digital_SMA_5 0.027
33 Total Investment 0.025
56 Content Marketing_EMA_8 0.024
76 Radio_EMA_8 0.021
51 Sponsorship_EMA_8 0.013
28 product_vertical_strap 0.012
82 Other_Ad_Stock 0.011
85 NPS_SMA_5 0.011
48 Sponsorship 0.009
68 SEM 0.008
88 Stock Index_SMA_5 0.008
27 product_vertical_softbox 0.007
26 product_vertical_reflectorumbrella 0.006
97 Snow on Grnd (cm) 0.006
46 Digital_EMA_8 0.006
42 TV_Ad_Stock 0.003
35 Total Investment_SMA_5 0.001
86 Stock Index 0.001
55 Content Marketing_SMA_5 -0.002
74 Radio_SMA_3 -0.003
34 Total Investment_SMA_3 -0.003
49 Sponsorship_SMA_3 -0.005
75 Radio_SMA_5 -0.005
43 Digital -0.006
15 product_vertical_camerahousing -0.006
30 product_vertical_telescope -0.011
67 Affiliates_Ad_Stock -0.014
78 Other -0.015
62 Online_marketing_Ad_Stock -0.018
37 Total_Investment_Ad_Stock -0.018
90 Min Temp -0.019
71 SEM_EMA_8 -0.019
92 Heat Deg Days -0.020
52 Sponsorship_Ad_Stock -0.021
98 Sale -0.022
50 Sponsorship_SMA_5 -0.022
64 Affiliates_SMA_3 -0.023
89 Max Temp -0.023
6 is_cod -0.025
47 Digital_Ad_Stock -0.026
59 Online marketing_SMA_3 -0.027
72 SEM_Ad_Stock -0.028
84 NPS_SMA_3 -0.031
83 NPS -0.031
44 Digital_SMA_3 -0.032
65 Affiliates_SMA_5 -0.032
87 Stock Index_SMA_3 -0.034
95 Total Snow (cm) -0.038
12 product_vertical_camerabatterygrip -0.039
23 product_vertical_flash -0.042
2 deliverybdays -0.044
70 SEM_SMA_5 -0.051
60 Online marketing_SMA_5 -0.059
14 product_vertical_camerafilmrolls -0.059
94 Total Rain (mm) -0.061
17 product_vertical_cameramicrophone -0.066
4 sla -0.086
1 Discount% -0.107
53 Content Marketing -0.119
80 Other_SMA_5 -0.126
39 TV_SMA_3 -0.163
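The coefficient table above is assembled through a list, an insert, a zip and a rename. As a sketch only, the same kind of table can be built directly from `coef_` and the column index. The data and column names below are hypothetical stand-ins, not the notebook's frames:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Hypothetical feature matrix and target
X = pd.DataFrame(rng.normal(size=(60, 3)), columns=["TV", "Radio", "Discount%"])
y = 0.5 * X["TV"] - 0.3 * X["Discount%"] + rng.normal(scale=0.1, size=60)

model = LinearRegression().fit(X, y)

# Pair each coefficient with its feature name, round, and turn into a two-column frame
coef_df = (
    pd.Series(model.coef_, index=X.columns, name="Coefficients")
    .round(3)
    .rename_axis("Features")
    .reset_index()
)

# Drop coefficients that round to zero and sort from largest to smallest
coef_df = coef_df[coef_df["Coefficients"] != 0.0].sort_values("Coefficients", ascending=False)
print(coef_df)
```

Indexing by the columns of the frame that was passed to `fit` keeps names and coefficients aligned without a separate "constant" placeholder.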

Plotting the Features in descending order of Importance for cameraaccessory

In [289]:
# Use a tall figure so that every feature label on the y-axis stays readable.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=cameraaccessory_mul_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive coefficients, i.e. the strongest positive association with GMV (revenue), for cameraaccessory are:
Features Coefficients
product_vertical_lens 0.181
product_vertical_camerabattery 0.160
is_mass_market 0.149
product_vertical_camerabatterycharger 0.121
TV 0.105
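The five features listed here are the largest positive coefficients. Some negative coefficients in the table are larger in magnitude than TV (for example TV_SMA_3 at -0.163), so a ranking by absolute value gives a different top five. A small sketch, using a handful of coefficients copied from the camera accessory table:

```python
import pandas as pd

# A handful of camera accessory coefficients copied from the table above
coef = pd.Series({
    "product_vertical_lens": 0.181,
    "product_vertical_camerabattery": 0.160,
    "is_mass_market": 0.149,
    "product_vertical_camerabatterycharger": 0.121,
    "TV": 0.105,
    "TV_SMA_3": -0.163,
    "Other_SMA_5": -0.126,
    "Content Marketing": -0.119,
    "Discount%": -0.107,
})

# Order by magnitude, keeping the sign for interpretation
ranked = coef.reindex(coef.abs().sort_values(ascending=False).index)
top5 = ranked.head(5)
print(top5)
```

Whether to rank by signed or absolute value depends on the question being asked: signed ranking highlights levers associated with higher revenue, while absolute ranking highlights the features the fitted model is most sensitive to.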

Building Linear Regression model for gamingaccessory

In [290]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

gamingaccessory_mul_model = LinearRegression().fit(X_gamingaccessory_mul_train, y_gamingaccessory_mul_train)
y_gamingaccessory_mul_test_pred = gamingaccessory_mul_model.predict(X_gamingaccessory_mul_test)

print('R2 Score: {}'.format(r2_score(y_gamingaccessory_mul_test, y_gamingaccessory_mul_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_mul_test, y_gamingaccessory_mul_test_pred)))
R2 Score: 0.9431103703903922
Mean Squared Error: 0.0894006867785758
With simple linear regression on the train/test split, we get an R2 score of 0.94 and an MSE of 0.09 on the test set.

Building Linear Regression model for gamingaccessory using K-fold Cross Validation

We will use 10-fold cross validation (cross_val_predict with cv=10) for our linear regression. No GridSearchCV hyperparameter search is run in this cell.

In [291]:
y_gamingaccessory_mul = gamingaccessory_mul_df.pop('gmv')
X_gamingaccessory_mul = gamingaccessory_mul_df
In [292]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

gamingaccessory_mul_model_cv = LinearRegression().fit(X_gamingaccessory_mul, y_gamingaccessory_mul)
gamingaccessory_mul_predictions_cv = cross_val_predict(gamingaccessory_mul_model_cv, X_gamingaccessory_mul, \
                                                       y_gamingaccessory_mul, cv=10)
accuracy = metrics.r2_score(y_gamingaccessory_mul, gamingaccessory_mul_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_mul, gamingaccessory_mul_predictions_cv)))
Cross-Predicted Accuracy: 0.9392349576865133
Mean Squared Error: 0.060765042313486624
With linear regression under 10-fold cross validation, we get an R2 score of 0.94 and an MSE of 0.06.
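The cell above imports `cross_val_score` but only uses `cross_val_predict`, which pools the out-of-fold predictions into a single R2. As a hedged sketch on synthetic data (the arrays, the shuffling and the random seeds are assumptions, not the notebook's setup), `cross_val_score` shows how much the score varies from fold to fold:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(2)

# Hypothetical features and target with a known linear relationship
X = rng.normal(size=(100, 4))
y = X @ np.array([0.5, -0.2, 0.0, 0.3]) + rng.normal(scale=0.2, size=100)

# Passing cv=10 for a regressor uses KFold without shuffling;
# shuffling is switched on here purely as an illustration.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# One R2 value per fold; the estimator is cloned and refitted on each training split
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
print(scores.round(3))
print(scores.mean(), scores.std())
```

The mean of the per-fold scores is generally not identical to the pooled R2 from `cross_val_predict`, so the two numbers should not be compared directly.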

Determining Feature Importance for gamingaccessory from model with cv

In [293]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


gamingaccessory_mul_lr_model_parameters = list(gamingaccessory_mul_model_cv.coef_)
gamingaccessory_mul_lr_model_parameters.insert(0, gamingaccessory_mul_model_cv.intercept_)
gamingaccessory_mul_lr_model_parameters = [round(x, 3) for x in gamingaccessory_mul_lr_model_parameters]
# Take the feature names from the frame the model was actually fitted on
cols = X_gamingaccessory_mul.columns
cols = cols.insert(0, "constant")
gamingaccessory_mul_lr_coef = list(zip(cols, gamingaccessory_mul_lr_model_parameters))
gamingaccessory_mul_lr_coef
Out[293]:
[('constant', 0.0),
 ('Discount%', -0.088),
 ('deliverybdays', 0.016),
 ('deliverycdays', -0.013),
 ('sla', 0.041),
 ('product_procurement_sla', -0.009),
 ('is_cod', -0.04),
 ('is_mass_market', 0.234),
 ('product_vertical_gamecontrolmount', 0.0),
 ('product_vertical_gamepad', 0.211),
 ('product_vertical_gamingaccessorykit', -0.041),
 ('product_vertical_gamingadapter', -0.056),
 ('product_vertical_gamingchargingstation', 0.06),
 ('product_vertical_gamingheadset', 0.25),
 ('product_vertical_gamingkeyboard', -0.027),
 ('product_vertical_gamingmemorycard', -0.127),
 ('product_vertical_gamingmouse', 0.224),
 ('product_vertical_gamingmousepad', 0.044),
 ('product_vertical_gamingspeaker', -0.008),
 ('product_vertical_joystickgamingwheel', 0.136),
 ('product_vertical_motioncontroller', 0.088),
 ('product_vertical_tvoutcableaccessory', -0.006),
 ('payday_week', 0.0),
 ('holiday_week', 0.0),
 ('Total Investment', 0.002),
 ('Total Investment_SMA_3', -0.053),
 ('Total Investment_SMA_5', -0.068),
 ('Total Investment_EMA_8', -0.077),
 ('Total_Investment_Ad_Stock', -0.071),
 ('TV', 0.027),
 ('TV_SMA_3', 0.062),
 ('TV_SMA_5', -0.008),
 ('TV_EMA_8', -0.031),
 ('TV_Ad_Stock', -0.057),
 ('Digital', -0.022),
 ('Digital_SMA_3', -0.061),
 ('Digital_SMA_5', 0.022),
 ('Digital_EMA_8', 0.054),
 ('Digital_Ad_Stock', 0.052),
 ('Sponsorship', -0.029),
 ('Sponsorship_SMA_3', 0.118),
 ('Sponsorship_SMA_5', -0.055),
 ('Sponsorship_EMA_8', -0.09),
 ('Sponsorship_Ad_Stock', -0.021),
 ('Content Marketing', -0.06),
 ('Content Marketing_SMA_3', 0.012),
 ('Content Marketing_SMA_5', -0.049),
 ('Content Marketing_EMA_8', -0.029),
 ('Content_Marketing_Ad_Stock', 0.133),
 ('Online marketing', 0.047),
 ('Online marketing_SMA_3', 0.157),
 ('Online marketing_SMA_5', -0.015),
 ('Online marketing_EMA_8', -0.004),
 ('Online_marketing_Ad_Stock', 0.016),
 ('Affiliates', 0.049),
 ('Affiliates_SMA_3', 0.093),
 ('Affiliates_SMA_5', -0.021),
 ('Affiliates_EMA_8', -0.013),
 ('Affiliates_Ad_Stock', -0.004),
 ('SEM', 0.016),
 ('SEM_SMA_3', -0.004),
 ('SEM_SMA_5', -0.038),
 ('SEM_EMA_8', 0.029),
 ('SEM_Ad_Stock', 0.035),
 ('Radio', 0.033),
 ('Radio_SMA_3', -0.021),
 ('Radio_SMA_5', 0.021),
 ('Radio_EMA_8', -0.084),
 ('Radio_Ad_Stock', 0.053),
 ('Other', -0.02),
 ('Other_SMA_3', -0.005),
 ('Other_SMA_5', 0.036),
 ('Other_EMA_8', -0.084),
 ('Other_Ad_Stock', 0.086),
 ('NPS', -0.016),
 ('NPS_SMA_3', 0.017),
 ('NPS_SMA_5', 0.008),
 ('Stock Index', -0.035),
 ('Stock Index_SMA_3', 0.017),
 ('Stock Index_SMA_5', 0.008),
 ('Max Temp', 0.005),
 ('Min Temp', 0.047),
 ('Mean Temp', 0.03),
 ('Heat Deg Days', 0.066),
 ('Cool Deg Days', -0.005),
 ('Total Rain (mm)', -0.019),
 ('Total Snow (cm)', 0.022),
 ('Total Precip (mm)', 0.019),
 ('Snow on Grnd (cm)', -0.02),
 ('Sale', -0.001)]
In [294]:
gamingaccessory_mul_lr_coef_df = pd.DataFrame(gamingaccessory_mul_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
gamingaccessory_mul_lr_coef_df = gamingaccessory_mul_lr_coef_df.rename(columns=col_rename)
gamingaccessory_mul_lr_coef_df = gamingaccessory_mul_lr_coef_df.iloc[1:,:]
gamingaccessory_mul_lr_coef_df = gamingaccessory_mul_lr_coef_df.loc[gamingaccessory_mul_lr_coef_df['Coefficients']!=0.0]
gamingaccessory_mul_lr_coef_df = gamingaccessory_mul_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
gamingaccessory_mul_lr_coef_df
Out[294]:
Features Coefficients
13 product_vertical_gamingheadset 0.250
7 is_mass_market 0.234
16 product_vertical_gamingmouse 0.224
9 product_vertical_gamepad 0.211
50 Online marketing_SMA_3 0.157
19 product_vertical_joystickgamingwheel 0.136
48 Content_Marketing_Ad_Stock 0.133
40 Sponsorship_SMA_3 0.118
55 Affiliates_SMA_3 0.093
20 product_vertical_motioncontroller 0.088
73 Other_Ad_Stock 0.086
83 Heat Deg Days 0.066
30 TV_SMA_3 0.062
12 product_vertical_gamingchargingstation 0.060
37 Digital_EMA_8 0.054
68 Radio_Ad_Stock 0.053
38 Digital_Ad_Stock 0.052
54 Affiliates 0.049
49 Online marketing 0.047
81 Min Temp 0.047
17 product_vertical_gamingmousepad 0.044
4 sla 0.041
71 Other_SMA_5 0.036
63 SEM_Ad_Stock 0.035
64 Radio 0.033
82 Mean Temp 0.030
62 SEM_EMA_8 0.029
29 TV 0.027
36 Digital_SMA_5 0.022
86 Total Snow (cm) 0.022
66 Radio_SMA_5 0.021
87 Total Precip (mm) 0.019
75 NPS_SMA_3 0.017
78 Stock Index_SMA_3 0.017
2 deliverybdays 0.016
53 Online_marketing_Ad_Stock 0.016
59 SEM 0.016
45 Content Marketing_SMA_3 0.012
76 NPS_SMA_5 0.008
79 Stock Index_SMA_5 0.008
80 Max Temp 0.005
24 Total Investment 0.002
89 Sale -0.001
52 Online marketing_EMA_8 -0.004
58 Affiliates_Ad_Stock -0.004
60 SEM_SMA_3 -0.004
84 Cool Deg Days -0.005
70 Other_SMA_3 -0.005
21 product_vertical_tvoutcableaccessory -0.006
31 TV_SMA_5 -0.008
18 product_vertical_gamingspeaker -0.008
5 product_procurement_sla -0.009
57 Affiliates_EMA_8 -0.013
3 deliverycdays -0.013
51 Online marketing_SMA_5 -0.015
74 NPS -0.016
85 Total Rain (mm) -0.019
88 Snow on Grnd (cm) -0.020
69 Other -0.020
56 Affiliates_SMA_5 -0.021
65 Radio_SMA_3 -0.021
43 Sponsorship_Ad_Stock -0.021
34 Digital -0.022
14 product_vertical_gamingkeyboard -0.027
47 Content Marketing_EMA_8 -0.029
39 Sponsorship -0.029
32 TV_EMA_8 -0.031
77 Stock Index -0.035
61 SEM_SMA_5 -0.038
6 is_cod -0.040
10 product_vertical_gamingaccessorykit -0.041
46 Content Marketing_SMA_5 -0.049
25 Total Investment_SMA_3 -0.053
41 Sponsorship_SMA_5 -0.055
11 product_vertical_gamingadapter -0.056
33 TV_Ad_Stock -0.057
44 Content Marketing -0.060
35 Digital_SMA_3 -0.061
26 Total Investment_SMA_5 -0.068
28 Total_Investment_Ad_Stock -0.071
27 Total Investment_EMA_8 -0.077
67 Radio_EMA_8 -0.084
72 Other_EMA_8 -0.084
1 Discount% -0.088
42 Sponsorship_EMA_8 -0.090
15 product_vertical_gamingmemorycard -0.127

Plotting the Features in descending order of Importance for gamingaccessory

In [295]:
# Use a tall figure so that every feature label on the y-axis stays readable.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=gamingaccessory_mul_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive coefficients, i.e. the strongest positive association with GMV (revenue), for gamingaccessory are:
Features Coefficients
product_vertical_gamingheadset 0.250
is_mass_market 0.234
product_vertical_gamingmouse 0.224
product_vertical_gamepad 0.211
Online marketing_SMA_3 0.157

Building Linear Regression model for homeaudio

In [296]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

homeaudio_mul_model = LinearRegression().fit(X_homeaudio_mul_train, y_homeaudio_mul_train)
y_homeaudio_mul_test_pred = homeaudio_mul_model.predict(X_homeaudio_mul_test)

print('R2 Score: {}'.format(r2_score(y_homeaudio_mul_test, y_homeaudio_mul_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_mul_test, y_homeaudio_mul_test_pred)))
R2 Score: -0.6349683688775178
Mean Squared Error: 0.3421291055323323
With simple linear regression on the train/test split, we get an R2 score of -0.63 and an MSE of 0.34 on the test set.

A negative R2 means the model predicts the test set worse than a horizontal line at the mean of the test target would. In other words, the model fitted on the training split generalises very poorly to the held-out homeaudio data.
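To make the meaning of a negative R2 concrete, here is a tiny worked example with made-up numbers. Predicting the mean of the target everywhere gives R2 = 0, and predictions that are further from the truth than that baseline give a negative value:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0])

# Predicting the mean of y_true for every point: SS_res == SS_tot, so R2 = 0
baseline = np.full_like(y_true, y_true.mean())
r2_baseline = r2_score(y_true, baseline)

# Predictions that trend the wrong way: SS_res = 8, SS_tot = 2, so R2 = 1 - 8/2 = -3
y_bad = np.array([3.0, 2.0, 1.0])
r2_bad = r2_score(y_true, y_bad)

print(r2_baseline, r2_bad)
```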

Building Linear Regression model for homeaudio using K-fold Cross Validation

We will use 5-fold cross validation (cross_val_predict with cv=5) for our linear regression. No GridSearchCV hyperparameter search is run in this cell.

In [297]:
y_homeaudio_mul = homeaudio_mul_df.pop('gmv')
X_homeaudio_mul = homeaudio_mul_df
In [298]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

homeaudio_mul_model_cv = LinearRegression().fit(X_homeaudio_mul, y_homeaudio_mul)
homeaudio_mul_predictions_cv = cross_val_predict(homeaudio_mul_model_cv, X_homeaudio_mul, y_homeaudio_mul, cv=5)
accuracy = metrics.r2_score(y_homeaudio_mul, homeaudio_mul_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_mul, homeaudio_mul_predictions_cv)))
Cross-Predicted Accuracy: 0.8611790689926854
Mean Squared Error: 0.13882093100731463
With linear regression under 5-fold cross validation, we get an R2 score of 0.86 and an MSE of 0.14.

Determining Feature Importance for homeaudio from model with cv

In [299]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


homeaudio_mul_lr_model_parameters = list(homeaudio_mul_model_cv.coef_)
homeaudio_mul_lr_model_parameters.insert(0, homeaudio_mul_model_cv.intercept_)
homeaudio_mul_lr_model_parameters = [round(x, 3) for x in homeaudio_mul_lr_model_parameters]
# Take the feature names from the frame the model was actually fitted on
cols = X_homeaudio_mul.columns
cols = cols.insert(0, "constant")
homeaudio_mul_lr_coef = list(zip(cols, homeaudio_mul_lr_model_parameters))
homeaudio_mul_lr_coef
Out[299]:
[('constant', -0.0),
 ('Discount%', 0.092),
 ('deliverybdays', 0.036),
 ('deliverycdays', -0.079),
 ('sla', -0.013),
 ('product_procurement_sla', -0.034),
 ('is_cod', -0.089),
 ('is_mass_market', 0.289),
 ('product_vertical_djcontroller', 0.027),
 ('product_vertical_dock', 0.003),
 ('product_vertical_dockingstation', 0.04),
 ('product_vertical_fmradio', 0.224),
 ('product_vertical_hifisystem', -0.028),
 ('product_vertical_homeaudiospeaker', 0.469),
 ('product_vertical_karaokeplayer', -0.0),
 ('product_vertical_slingbox', 0.002),
 ('product_vertical_soundmixer', 0.021),
 ('product_vertical_voicerecorder', -0.013),
 ('payday_week', 0.0),
 ('holiday_week', 0.0),
 ('Total Investment', 0.066),
 ('Total Investment_SMA_3', -0.02),
 ('Total Investment_SMA_5', -0.06),
 ('Total Investment_EMA_8', 0.048),
 ('Total_Investment_Ad_Stock', 0.023),
 ('TV', -0.045),
 ('TV_SMA_3', -0.012),
 ('TV_SMA_5', -0.077),
 ('TV_EMA_8', 0.011),
 ('TV_Ad_Stock', -0.014),
 ('Digital', -0.018),
 ('Digital_SMA_3', 0.051),
 ('Digital_SMA_5', -0.013),
 ('Digital_EMA_8', 0.01),
 ('Digital_Ad_Stock', -0.031),
 ('Sponsorship', 0.121),
 ('Sponsorship_SMA_3', -0.072),
 ('Sponsorship_SMA_5', -0.04),
 ('Sponsorship_EMA_8', 0.077),
 ('Sponsorship_Ad_Stock', 0.027),
 ('Content Marketing', 0.02),
 ('Content Marketing_SMA_3', -0.137),
 ('Content Marketing_SMA_5', 0.092),
 ('Content Marketing_EMA_8', 0.009),
 ('Content_Marketing_Ad_Stock', -0.055),
 ('Online marketing', -0.015),
 ('Online marketing_SMA_3', -0.036),
 ('Online marketing_SMA_5', -0.057),
 ('Online marketing_EMA_8', 0.025),
 ('Online_marketing_Ad_Stock', 0.029),
 ('Affiliates', -0.035),
 ('Affiliates_SMA_3', -0.011),
 ('Affiliates_SMA_5', -0.026),
 ('Affiliates_EMA_8', 0.027),
 ('Affiliates_Ad_Stock', 0.041),
 ('SEM', -0.041),
 ('SEM_SMA_3', 0.078),
 ('SEM_SMA_5', -0.012),
 ('SEM_EMA_8', 0.041),
 ('SEM_Ad_Stock', -0.01),
 ('Radio', 0.049),
 ('Radio_SMA_3', -0.05),
 ('Radio_SMA_5', -0.036),
 ('Radio_EMA_8', -0.127),
 ('Radio_Ad_Stock', 0.147),
 ('Other', 0.019),
 ('Other_SMA_3', -0.03),
 ('Other_SMA_5', 0.026),
 ('Other_EMA_8', -0.095),
 ('Other_Ad_Stock', 0.12),
 ('NPS', -0.042),
 ('NPS_SMA_3', -0.033),
 ('NPS_SMA_5', -0.087),
 ('Stock Index', 0.002),
 ('Stock Index_SMA_3', 0.027),
 ('Stock Index_SMA_5', -0.085),
 ('Max Temp', -0.062),
 ('Min Temp', -0.008),
 ('Mean Temp', 0.039),
 ('Heat Deg Days', -0.025),
 ('Cool Deg Days', -0.011),
 ('Total Rain (mm)', -0.083),
 ('Total Snow (cm)', -0.009),
 ('Total Precip (mm)', 0.089),
 ('Snow on Grnd (cm)', -0.021),
 ('Sale', 0.011)]
In [300]:
homeaudio_mul_lr_coef_df = pd.DataFrame(homeaudio_mul_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
homeaudio_mul_lr_coef_df = homeaudio_mul_lr_coef_df.rename(columns=col_rename)
homeaudio_mul_lr_coef_df = homeaudio_mul_lr_coef_df.iloc[1:,:]
homeaudio_mul_lr_coef_df = homeaudio_mul_lr_coef_df.loc[homeaudio_mul_lr_coef_df['Coefficients']!=0.0]
homeaudio_mul_lr_coef_df = homeaudio_mul_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
homeaudio_mul_lr_coef_df
Out[300]:
Features Coefficients
13 product_vertical_homeaudiospeaker 0.469
7 is_mass_market 0.289
11 product_vertical_fmradio 0.224
64 Radio_Ad_Stock 0.147
35 Sponsorship 0.121
69 Other_Ad_Stock 0.120
1 Discount% 0.092
42 Content Marketing_SMA_5 0.092
83 Total Precip (mm) 0.089
56 SEM_SMA_3 0.078
38 Sponsorship_EMA_8 0.077
20 Total Investment 0.066
31 Digital_SMA_3 0.051
60 Radio 0.049
23 Total Investment_EMA_8 0.048
58 SEM_EMA_8 0.041
54 Affiliates_Ad_Stock 0.041
10 product_vertical_dockingstation 0.040
78 Mean Temp 0.039
2 deliverybdays 0.036
49 Online_marketing_Ad_Stock 0.029
8 product_vertical_djcontroller 0.027
74 Stock Index_SMA_3 0.027
53 Affiliates_EMA_8 0.027
39 Sponsorship_Ad_Stock 0.027
67 Other_SMA_5 0.026
48 Online marketing_EMA_8 0.025
24 Total_Investment_Ad_Stock 0.023
16 product_vertical_soundmixer 0.021
40 Content Marketing 0.020
65 Other 0.019
85 Sale 0.011
28 TV_EMA_8 0.011
33 Digital_EMA_8 0.010
43 Content Marketing_EMA_8 0.009
9 product_vertical_dock 0.003
73 Stock Index 0.002
15 product_vertical_slingbox 0.002
77 Min Temp -0.008
82 Total Snow (cm) -0.009
59 SEM_Ad_Stock -0.010
80 Cool Deg Days -0.011
51 Affiliates_SMA_3 -0.011
26 TV_SMA_3 -0.012
57 SEM_SMA_5 -0.012
17 product_vertical_voicerecorder -0.013
4 sla -0.013
32 Digital_SMA_5 -0.013
29 TV_Ad_Stock -0.014
45 Online marketing -0.015
30 Digital -0.018
21 Total Investment_SMA_3 -0.020
84 Snow on Grnd (cm) -0.021
79 Heat Deg Days -0.025
52 Affiliates_SMA_5 -0.026
12 product_vertical_hifisystem -0.028
66 Other_SMA_3 -0.030
34 Digital_Ad_Stock -0.031
71 NPS_SMA_3 -0.033
5 product_procurement_sla -0.034
50 Affiliates -0.035
62 Radio_SMA_5 -0.036
46 Online marketing_SMA_3 -0.036
37 Sponsorship_SMA_5 -0.040
55 SEM -0.041
70 NPS -0.042
25 TV -0.045
61 Radio_SMA_3 -0.050
44 Content_Marketing_Ad_Stock -0.055
47 Online marketing_SMA_5 -0.057
22 Total Investment_SMA_5 -0.060
76 Max Temp -0.062
36 Sponsorship_SMA_3 -0.072
27 TV_SMA_5 -0.077
3 deliverycdays -0.079
81 Total Rain (mm) -0.083
75 Stock Index_SMA_5 -0.085
72 NPS_SMA_5 -0.087
6 is_cod -0.089
68 Other_EMA_8 -0.095
63 Radio_EMA_8 -0.127
41 Content Marketing_SMA_3 -0.137

Plotting the Features in descending order of Importance for homeaudio

In [301]:
# Use a tall figure so that every feature label on the y-axis stays readable.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=homeaudio_mul_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive coefficients, i.e. the strongest positive association with GMV (revenue), for homeaudio are:
Features Coefficients
product_vertical_homeaudiospeaker 0.469
is_mass_market 0.289
product_vertical_fmradio 0.224
Radio_Ad_Stock 0.147
Sponsorship 0.121

Koyck Model

The additive and multiplicative linear models built so far capture only the current effect of the KPIs. To capture the carry-over effect as well, we need to model current revenue as a function of past periods too.

The Koyck model says that current revenue is influenced not only by the independent attributes but also by the revenue generated in earlier periods, i.e. current revenue (Y_t) also depends on the previous period's revenue (Y_{t-1}).

Basic linear model:

Y_t = α + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + ϵ

Koyck model (sale at time t is dependent on sale at time t-1):

Y_t = α + µY_{t-1} + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + ϵ

If X1 is advertising, β1 is the current effect of advertising. Each period a fraction µ of the previous effect carries forward, so for 0 ≤ µ < 1 the carry-over effect is β1(µ + µ² + µ³ + ...) = β1 * µ/(1-µ).

Therefore the total effect of advertising = Current effect + Carry over effect

                                        = β1 + β1 * µ/(1-µ)

                                        = β1/(1-µ)
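The Koyck formulas above can be sketched on simulated data. This is an illustration under stated assumptions, not the notebook's model: the series, the column names `adspend` and `gmv_lag_1`, and the true values µ = 0.5 and β1 = 2.0 are all hypothetical. The lagged target is created with `shift(1)` and added as an extra regressor:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 500
mu_true, beta_true = 0.5, 2.0   # hypothetical carry-over rate and current effect

# Simulate y_t = mu * y_{t-1} + beta * x_t + noise
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = mu_true * y[t - 1] + beta_true * x[t] + rng.normal(scale=0.1)

df = pd.DataFrame({"gmv": y, "adspend": x})
df["gmv_lag_1"] = df["gmv"].shift(1)   # Y_{t-1}
df = df.dropna()                       # the first row has no lagged value

model = LinearRegression().fit(df[["gmv_lag_1", "adspend"]], df["gmv"])
mu_hat, beta_hat = model.coef_

current_effect = beta_hat
carry_over_effect = beta_hat * mu_hat / (1 - mu_hat)
total_effect = beta_hat / (1 - mu_hat)
print(mu_hat, beta_hat, total_effect)
```

With µ = 0.5 the carry-over effect equals the current effect, so the total effect is roughly twice β1. The closer the estimated µ gets to 1, the more the total effect is dominated by carry-over, and the formula no longer applies once µ reaches 1.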
In [302]:
homeaudio_org_df.head()
Out[302]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [303]:
# Making copy of dataframes from the original ones
cameraaccessory_koy_df = cameraaccessory_org_df.copy()
gamingaccessory_koy_df = gamingaccessory_org_df.copy()
homeaudio_koy_df = homeaudio_org_df.copy()
homeaudio_koy_df.head()
Out[303]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [304]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_koy_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_koy_df.isnull().sum()/homeaudio_koy_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[304]:
Total Percentage
Sale 0 0.000
Digital 0 0.000
Total Investment_SMA_5 0 0.000
Total Investment_EMA_8 0 0.000
Total_Investment_Ad_Stock 0 0.000
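
The null check above builds the count and the percentage as two separate frames and then concatenates them. As a minimal sketch, the same summary can be assembled in one frame; `toy_df` and its values are hypothetical and only stand in for the notebook's dataframes.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; column names and values are illustrative only.
toy_df = pd.DataFrame({
    "gmv": [100.0, np.nan, 300.0, 400.0],
    "TV": [1.0, 2.0, 3.0, 4.0],
})

# isnull().sum() counts missing values per column;
# isnull().mean() gives the fraction of missing values per column.
null_summary = pd.DataFrame({
    "Total": toy_df.isnull().sum(),
    "Percentage": (100 * toy_df.isnull().mean()).round(2),
}).sort_values("Total", ascending=False)

print(null_summary)
```

Sorting once on the combined frame keeps the two columns aligned by construction.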

Creating a new variable: the 1-week lag of the dependent variable (GMV)

In [305]:
GMV_Lag = ['gmv']
In [306]:
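def lag_variables(df, var, n):
    # Insert an n-row lag of each column listed in `var` right after that column (modifies df in place)
    for i in var:
        loc_index = df.columns.get_loc(i) + 1
        # str() replaces np.str, which was removed in NumPy 1.24
        df.insert(loc=loc_index, column=i + '_lag' + str(n), value=df[i].shift(n))
    return df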
In [307]:
cameraaccessory_koy_df = lag_variables(cameraaccessory_koy_df,GMV_Lag,1) 
gamingaccessory_koy_df = lag_variables(gamingaccessory_koy_df,GMV_Lag,1) 
homeaudio_koy_df = lag_variables(homeaudio_koy_df,GMV_Lag,1) 
homeaudio_koy_df.head()
Out[307]:
Week gmv gmv_lag1 Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 nan 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 4573783.133 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 5371525.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 4679828.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 3451151.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
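
To make the lag step concrete, here is a small sketch of what `shift(1)` does on a toy frame. The frame `toy_df` and its values are hypothetical and are not taken from the ElecKart data.

```python
import pandas as pd

# Hypothetical weekly data, illustrative values only.
toy_df = pd.DataFrame({
    "Week": [28, 29, 30, 31],
    "gmv": [10.0, 20.0, 30.0, 40.0],
})

# shift(1) moves every value down one row, so each row holds the gmv of the
# previous row. The first row has no predecessor and becomes NaN.
toy_df["gmv_lag1"] = toy_df["gmv"].shift(1)

print(toy_df)
```

Note that `shift` works on row order, not on the value of `Week`. Where weeks are missing (the output of `head(10)` further down jumps from Week 32 to Week 36), the lag holds the previous available row rather than the previous calendar week.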
In [308]:
# Imputing all null values with 0 (the lag step leaves NaN in the first row of gmv_lag1)
cameraaccessory_koy_df.fillna(value=0, inplace=True)
gamingaccessory_koy_df.fillna(value=0, inplace=True)
homeaudio_koy_df.fillna(value=0, inplace=True)
homeaudio_koy_df.head(10)
Out[308]:
Week gmv gmv_lag1 Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 0.000 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 4573783.133 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 5371525.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 4679828.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 3451151.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
30 36 3875305.000 2599.000 35.972 0.000 0.000 5.599 2.790 1427 1326 5 48 1 525.000 36 1108.000 0 0 0 66 1 0 1.013 1.013 1.013 1.939 3.057 0.001 0.001 0.001 0.016 0.011 0.256 0.256 0.256 0.363 0.697 0.213 0.213 0.213 0.680 0.805 0.000 0.000 0.000 0.000 0.000 0.026 0.026 0.026 0.113 0.116 0.015 0.015 0.015 0.050 0.058 0.503 0.503 0.503 0.717 1.372 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 59.987 59.987 1206.000 1206.000 1206.000 32.000 17.500 24.460 0.000 6.460 12.120 0.000 12.120 0.000 0
31 37 4190321.000 3875305.000 35.737 0.000 0.000 5.577 2.897 1655 1508 7 53 3 609.000 35 1215.000 0 0 0 81 0 0 24.064 8.697 5.623 6.855 25.898 0.970 0.324 0.195 0.228 0.977 0.339 0.284 0.273 0.358 0.757 15.697 5.374 3.310 4.017 16.180 0.153 0.051 0.031 0.034 0.153 4.095 1.382 0.840 0.998 4.165 1.260 0.430 0.264 0.319 1.295 1.551 0.852 0.713 0.903 2.374 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 55.633 57.375 1101.000 1171.000 1185.000 32.500 9.000 19.240 1.280 2.520 0.960 0.000 0.960 0.000 0
32 38 3740780.000 4190321.000 35.231 0.000 0.000 6.246 2.696 1459 1336 7 52 4 532.000 32 1110.000 0 0 0 45 1 0 24.064 16.380 10.233 10.680 39.603 0.970 0.647 0.389 0.393 1.556 0.339 0.311 0.289 0.354 0.793 15.697 10.536 6.407 6.613 25.405 0.153 0.102 0.061 0.060 0.245 4.095 2.739 1.654 1.686 6.594 1.260 0.845 0.513 0.528 2.037 1.551 1.202 0.922 1.047 2.976 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 51.279 54.762 1101.000 1136.000 1164.000 27.500 13.000 20.550 0.000 2.550 1.100 0.000 1.100 0.000 0
33 39 4212446.000 3740780.000 34.340 0.000 0.000 6.332 2.673 1745 1686 6 52 3 716.000 30 1276.000 0 0 0 69 0 0 24.064 24.064 14.844 13.654 47.826 0.970 0.970 0.582 0.521 1.904 0.339 0.339 0.306 0.350 0.815 15.697 15.697 9.503 8.631 30.940 0.153 0.153 0.092 0.081 0.300 4.095 4.095 2.467 2.221 8.051 1.260 1.260 0.762 0.691 2.482 1.551 1.551 1.132 1.159 3.336 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 52.150 1101.000 1101.000 1143.000 25.500 14.500 20.000 0.000 2.000 0.000 0.000 0.000 0.000 0
34 40 4149262.000 4212446.000 33.715 0.004 0.005 6.394 2.515 1653 1519 4 52 1 644.000 30 1262.000 0 0 0 58 1 0 24.064 24.064 19.454 15.967 52.759 0.970 0.970 0.776 0.621 2.112 0.339 0.339 0.322 0.348 0.828 15.697 15.697 12.600 10.202 34.261 0.153 0.153 0.122 0.097 0.333 4.095 4.095 3.281 2.638 8.926 1.260 1.260 1.011 0.817 2.749 1.551 1.551 1.341 1.246 3.553 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 49.538 1101.000 1101.000 1122.000 26.500 8.000 17.725 1.925 1.650 2.450 0.000 2.450 0.000 0
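
Filling the first lag value with 0 treats an unknown previous week as a week with no revenue. The sketch below contrasts that choice with dropping the row that has no lag; the drop variant is an assumption offered for comparison, not the approach this notebook takes, and `toy_df` is a hypothetical frame.

```python
import pandas as pd

# Hypothetical toy frame, illustrative values only.
toy_df = pd.DataFrame({"gmv": [10.0, 20.0, 30.0, 40.0]})
toy_df["gmv_lag1"] = toy_df["gmv"].shift(1)

# Choice made in this notebook: replace the missing first lag with 0.
filled_zero = toy_df.fillna(value=0)

# Alternative (assumption, not the notebook's method): drop the row without a lag.
dropped = toy_df.dropna(subset=["gmv_lag1"]).reset_index(drop=True)

print(filled_zero)
print(dropped)
```

With roughly one year of weekly rows, dropping costs one observation, while zero-filling keeps the row at the price of one artificial lag value.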

We will drop the Week column, as it is only a row identifier and will not help in predicting revenue.

In [309]:
# Removing the Week column
cameraaccessory_koy_df = cameraaccessory_koy_df.drop('Week', axis=1)
gamingaccessory_koy_df = gamingaccessory_koy_df.drop('Week', axis=1)
homeaudio_koy_df = homeaudio_koy_df.drop('Week', axis=1)
homeaudio_koy_df.head()
Out[309]:
gmv gmv_lag1 Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 4573783.133 0.000 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 5371525.000 4573783.133 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 4679828.000 5371525.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 3451151.000 4679828.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 2599.000 3451151.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [310]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_koy_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_koy_df.isnull().sum()/homeaudio_koy_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[310]:
Total Percentage
Sale 0 0.000
Digital 0 0.000
Total Investment_SMA_5 0 0.000
Total Investment_EMA_8 0 0.000
Total_Investment_Ad_Stock 0 0.000

Rescaling the Features of the 3 Dataframes

We will use standard scaling (zero mean, unit variance). Note that every column is scaled, including the target gmv.

In [311]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

cameraaccessory_koy_df[cameraaccessory_koy_df.columns]=scaler.fit_transform(cameraaccessory_koy_df[cameraaccessory_koy_df.columns])
gamingaccessory_koy_df[gamingaccessory_koy_df.columns]=scaler.fit_transform(gamingaccessory_koy_df[gamingaccessory_koy_df.columns])
homeaudio_koy_df[homeaudio_koy_df.columns]=scaler.fit_transform(homeaudio_koy_df[homeaudio_koy_df.columns])

homeaudio_koy_df.head()
Out[311]:
gmv gmv_lag1 Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 -0.107 -1.832 -0.920 -0.639 -0.639 1.882 1.297 -0.132 -0.178 1.670 -0.198 -1.132 0.062 -0.420 -0.206 -0.146 -0.197 -0.542 -0.233 -1.043 -0.545 -1.229 -1.739 -1.806 -1.814 -1.789 -1.331 -1.512 -1.593 -1.803 -1.633 0.042 -0.831 -0.893 0.061 -0.563 -0.922 -1.326 -1.378 -1.404 -1.357 -0.718 -0.821 -0.946 -1.204 -0.952 -2.079 -2.266 -2.221 -2.172 -2.283 -2.144 -2.346 -2.246 -2.136 -2.313 -0.308 -1.054 -1.129 -0.519 -0.896 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 1.550 -4.560 -3.235 0.157 -4.601 -3.265 0.776 1.118 0.959 -1.035 0.461 0.430 -0.368 0.324 -0.275 -0.435
26 0.203 -0.092 -0.594 -0.639 -0.639 1.437 0.560 0.179 0.130 1.267 0.481 -1.132 0.374 0.967 0.079 -0.146 -0.197 -0.542 -0.021 0.959 -0.545 -1.229 -1.739 -1.806 -1.814 -1.667 -1.331 -1.512 -1.593 -1.803 -1.608 0.042 -0.831 -0.893 0.061 -0.303 -0.922 -1.326 -1.378 -1.404 -1.265 -0.718 -0.821 -0.946 -1.204 -0.952 -2.079 -2.266 -2.221 -2.172 -2.234 -2.144 -2.346 -2.246 -2.136 -2.248 -0.308 -1.054 -1.129 -0.519 -0.687 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 1.550 -4.560 -3.235 0.157 -4.601 -3.265 1.308 0.947 1.314 -1.075 1.755 -0.424 -0.368 -0.472 -0.275 0.870
27 -0.066 0.212 -0.725 -0.639 -0.639 1.538 1.280 0.059 0.078 0.059 0.721 -1.310 0.390 0.529 -0.142 -0.146 -0.197 -0.542 -0.835 -1.043 -0.545 -1.229 -1.299 -1.806 -1.814 -1.594 -1.331 -1.420 -1.593 -1.803 -1.593 0.042 0.082 -0.893 0.061 -0.147 -0.922 -0.994 -1.378 -1.404 -1.210 -0.718 -0.821 -0.946 -1.204 -0.952 -2.079 -2.071 -2.221 -2.172 -2.205 -2.144 -2.083 -2.246 -2.136 -2.208 -0.308 -0.317 -1.129 -0.519 -0.563 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 1.550 0.743 -3.235 0.157 0.255 -3.265 1.149 1.347 1.300 -1.075 1.698 -0.515 -0.368 -0.557 -0.275 -0.435
28 -0.543 -0.052 -0.757 -0.639 -0.639 1.687 0.487 -0.503 -0.548 -0.747 0.202 -1.310 -0.455 -0.639 -0.606 -0.146 -0.197 -0.542 -0.906 0.959 -0.545 -1.229 -1.299 -1.806 -1.814 -1.550 -1.331 -1.420 -1.593 -1.803 -1.585 0.042 0.082 -0.893 0.061 -0.054 -0.922 -0.994 -1.378 -1.404 -1.176 -0.718 -0.821 -0.946 -1.204 -0.952 -2.079 -2.071 -2.221 -2.172 -2.187 -2.144 -2.083 -2.246 -2.136 -2.185 -0.308 -0.317 -1.129 -0.519 -0.488 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 1.550 0.743 -3.235 0.157 0.255 -3.265 1.361 1.518 1.473 -1.075 2.395 0.491 -0.368 0.382 -0.275 -0.435
29 -1.882 -0.519 -4.223 -0.639 -0.639 3.768 -4.146 -1.862 -1.900 -1.552 -1.517 -1.310 -2.715 -2.099 -1.779 -0.146 -0.197 -0.542 -2.465 -1.043 -0.545 -1.538 -1.411 -1.408 -1.922 -1.679 -1.414 -1.450 -1.514 -1.829 -1.619 -0.434 -0.100 0.023 -0.130 -0.255 -1.186 -1.092 -1.081 -1.502 -1.292 -0.718 -0.821 -0.946 -1.204 -0.952 -2.256 -2.131 -2.063 -2.215 -2.252 -2.381 -2.161 -2.036 -2.191 -2.268 -0.701 -0.464 -0.404 -0.672 -0.651 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 3.068 0.918 0.780 0.520 0.295 0.356 0.830 1.404 1.137 -1.075 1.046 -0.721 -0.368 -0.750 -0.275 -0.435
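
As a quick check of what standard scaling computes, the sketch below compares `StandardScaler` with a manual z-score, z = (x - mean) / std, on a hypothetical toy frame. `StandardScaler` uses the population standard deviation (`ddof=0`), so the manual version must do the same to match.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame, illustrative values only.
toy_df = pd.DataFrame({
    "gmv": [1.0, 2.0, 3.0, 4.0],
    "TV": [10.0, 10.0, 30.0, 30.0],
})

# Scale all columns and keep the original column names and index.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(toy_df),
    columns=toy_df.columns,
    index=toy_df.index,
)

# Manual z-score using the population standard deviation (ddof=0).
manual = (toy_df - toy_df.mean()) / toy_df.std(ddof=0)

print(scaled)
```

Because the target is scaled along with the features, coefficients from a model fitted on this data are expressed in standard-deviation units rather than in revenue units.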

Splitting the 3 Dataframes into Training and Testing Sets

Before fitting a regression model, we split each dataframe into a training set (70% of the rows) and a test set (30%).

In [312]:
from sklearn.model_selection import train_test_split

# random_state fixes the seed so that the train and test sets contain the same rows on every run

cameraaccessory_train, cameraaccessory_test = train_test_split(cameraaccessory_koy_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)

gamingaccessory_train, gamingaccessory_test = train_test_split(gamingaccessory_koy_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)

homeaudio_train, homeaudio_test = train_test_split(homeaudio_koy_df, \
                                                               train_size = 0.7, test_size = 0.3, random_state = 100)
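
The effect of `random_state` can be illustrated with a short sketch: two calls with the same seed select exactly the same rows. The frame `toy_df` is hypothetical; only the split parameters mirror the cell above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy frame with 10 rows, illustrative values only.
toy_df = pd.DataFrame({"gmv": list(range(10)), "TV": list(range(10, 20))})

# Two independent calls with the same seed.
train_a, test_a = train_test_split(toy_df, train_size=0.7, test_size=0.3, random_state=100)
train_b, test_b = train_test_split(toy_df, train_size=0.7, test_size=0.3, random_state=100)

# Same seed, same rows: the row labels of both splits are identical.
print(train_a.index.equals(train_b.index))
print(len(train_a), len(test_a))
```

The split shuffles rows by default, which is why the training rows shown in this notebook appear in a non-sequential index order.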

Dividing the 3 dataframes into X and Y sets for the model building

In [313]:
y_cameraaccessory_train = cameraaccessory_train.pop('gmv')
X_cameraaccessory_train = cameraaccessory_train

y_gamingaccessory_train = gamingaccessory_train.pop('gmv')
X_gamingaccessory_train = gamingaccessory_train

y_homeaudio_train = homeaudio_train.pop('gmv')
X_homeaudio_train = homeaudio_train

X_homeaudio_train.head()
Out[313]:
gmv_lag1 Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
12 -0.417 -0.416 1.839 1.854 0.152 0.222 -1.768 -0.455 0.461 -0.318 0.999 -0.180 0.675 -0.598 -0.146 6.119 0.223 1.148 -1.043 -0.545 -0.283 0.470 0.708 0.394 0.331 0.628 1.853 2.252 1.697 1.682 -0.482 -0.222 -0.131 -0.437 -0.329 -0.242 0.280 0.475 0.067 0.152 -0.678 -0.493 -0.500 -0.515 -0.587 0.125 0.344 0.410 0.465 0.333 0.358 0.556 0.607 0.631 0.540 -0.413 -0.340 -0.310 -0.566 -0.430 -0.476 0.435 0.651 0.566 0.256 -0.456 1.162 1.617 1.283 0.913 0.769 0.292 0.340 0.970 -0.101 -0.019 -0.978 -1.278 -1.209 1.287 -0.641 -0.155 0.330 -0.076 -0.275 -0.435
10 -0.023 -0.239 -0.347 -0.317 -0.099 -0.403 -0.385 -0.176 1.267 -0.638 -0.422 0.202 0.602 -0.402 -0.146 -0.197 -0.159 0.582 -1.043 1.834 0.744 0.842 0.268 0.409 0.604 2.237 2.432 1.312 1.529 1.995 -0.094 -0.074 -0.124 -0.345 -0.156 0.454 0.538 0.017 -0.040 0.280 -0.307 -0.351 -0.341 -0.292 -0.361 0.404 0.438 0.319 0.510 0.409 0.612 0.640 0.461 0.640 0.583 -0.286 -0.292 -0.363 -0.540 -0.391 0.771 0.945 0.319 0.869 0.765 1.658 2.021 1.050 1.620 1.733 -0.022 0.201 0.337 -1.866 -0.413 0.096 -0.021 -0.707 -0.695 0.656 -0.641 2.802 -0.368 2.538 -0.275 -0.435
32 -0.238 -0.106 -0.639 -0.639 0.583 0.241 -0.268 -0.215 1.267 0.561 -0.600 0.148 0.237 -0.508 -0.146 -0.197 -0.542 -0.871 0.959 -0.545 0.654 -0.049 -0.680 -0.858 -0.110 0.108 -0.414 -0.888 -1.051 -0.499 -0.329 -0.382 -0.418 -0.575 -0.453 1.308 0.563 -0.132 -0.130 0.590 -0.042 -0.305 -0.588 -0.745 -0.375 0.106 -0.659 -1.257 -1.303 -0.740 0.041 -0.723 -1.288 -1.335 -0.797 -0.154 -0.349 -0.525 -0.710 -0.421 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 -0.613 0.421 0.714 -0.792 0.086 0.299 0.723 1.175 1.011 -1.075 0.538 -0.509 -0.368 -0.552 -0.275 -0.435
22 -0.383 -1.292 1.691 1.694 -0.740 0.267 -0.438 -0.498 -0.344 -0.878 0.644 -0.186 -1.515 -0.600 -0.146 -0.197 2.137 1.007 0.959 1.834 -0.616 -0.465 -0.304 -0.389 -0.510 -0.763 -0.886 -0.993 -0.739 -0.925 -0.309 -0.416 -0.508 -0.787 -0.508 -0.212 -0.199 -0.152 -0.196 -0.209 -0.669 -0.518 -0.366 -0.590 -0.599 -1.115 -0.562 -0.102 -0.095 -0.539 -1.064 -0.589 -0.205 -0.137 -0.558 -0.476 -0.418 -0.341 -0.606 -0.492 -0.476 -0.070 0.328 0.214 -0.094 -0.456 -0.341 -0.226 -0.253 -0.402 0.399 0.243 0.269 0.370 0.372 0.453 1.308 0.662 1.097 -1.015 1.081 -0.509 -0.368 -0.552 -0.275 -0.435
45 0.818 0.693 -0.639 -0.639 0.210 0.979 0.194 -0.282 0.461 0.122 0.289 -0.218 0.164 0.107 -0.146 -0.197 -0.542 -0.410 0.959 -0.545 0.904 0.449 0.045 0.624 0.574 0.703 0.491 0.304 0.527 0.560 0.210 0.028 -0.138 0.394 0.121 1.063 0.538 0.055 0.718 0.676 0.461 0.136 -0.202 0.468 0.247 0.994 0.699 0.459 0.659 0.754 0.910 0.685 0.507 0.631 0.724 0.494 0.244 0.015 0.626 0.369 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 -0.930 -0.073 0.119 -1.579 -0.050 0.271 -1.616 -0.878 -1.364 1.477 -0.641 0.170 -0.368 0.082 -0.275 -0.435
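
One detail of the cell above is that `pop` both returns the target column and removes it from the frame, so the X variable is the same object as the original training frame. A minimal sketch with a hypothetical `toy_train` frame:

```python
import pandas as pd

# Hypothetical training frame, illustrative values only.
toy_train = pd.DataFrame({
    "gmv": [1.0, 2.0, 3.0],
    "TV": [0.1, 0.2, 0.3],
    "SEM": [5.0, 6.0, 7.0],
})

# pop returns the 'gmv' column as a Series and deletes it from toy_train in place.
y_toy = toy_train.pop("gmv")

# X_toy is just another name for toy_train, which no longer contains 'gmv'.
X_toy = toy_train

print(list(X_toy.columns))
```

Re-running such a cell without recreating the split would fail, because the target column is already gone from the frame.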

Dividing the 3 test dataframes into X and Y sets for model evaluation

In [314]:
y_cameraaccessory_test = cameraaccessory_test.pop('gmv')
X_cameraaccessory_test = cameraaccessory_test

y_gamingaccessory_test = gamingaccessory_test.pop('gmv')
X_gamingaccessory_test = gamingaccessory_test

y_homeaudio_test = homeaudio_test.pop('gmv')
X_homeaudio_test = homeaudio_test

X_homeaudio_test.head()
Out[314]:
gmv_lag1 Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
31 -0.358 0.004 -0.639 -0.639 -0.191 1.508 -0.054 0.001 1.267 0.601 -0.777 0.562 0.456 -0.388 -0.146 -0.197 -0.542 0.404 -1.043 -0.545 0.654 -0.842 -1.187 -1.428 -0.761 0.108 -0.962 -1.239 -1.417 -0.936 -0.329 -0.422 -0.445 -0.565 -0.478 1.308 -0.362 -0.734 -0.825 -0.172 -0.042 -0.563 -0.767 -0.946 -0.592 0.106 -1.455 -1.731 -1.745 -1.339 0.041 -1.520 -1.753 -1.764 -1.389 -0.154 -0.554 -0.662 -0.842 -0.587 -0.476 -0.584 -0.677 -0.845 -0.686 -0.456 -0.555 -0.652 -0.814 -0.657 -0.613 0.844 0.902 -0.792 0.230 0.363 1.255 0.719 0.860 -0.895 0.524 -0.549 -0.368 -0.589 -0.275 -0.435
5 0.003 1.370 -0.636 -0.636 -0.740 -0.306 0.910 0.520 -1.149 0.561 -0.600 0.100 -0.566 0.971 -0.146 -0.197 -0.542 0.546 0.959 1.834 -0.721 -0.747 -0.355 -0.276 -0.599 -0.604 -0.635 -0.236 -0.133 -0.452 -0.273 -0.279 -0.441 -0.538 -0.411 -0.843 -0.906 -1.022 -0.953 -1.015 -0.193 -0.219 -0.002 0.096 -0.113 0.038 0.068 0.506 0.623 0.314 0.105 0.138 0.579 0.668 0.384 -0.455 -0.482 -0.470 -0.496 -0.528 -0.476 -0.584 1.378 1.153 0.333 -0.456 -0.555 1.276 1.017 0.287 0.346 0.328 0.301 0.720 0.441 0.268 -0.340 -1.677 -0.759 0.734 -0.641 -0.820 -0.368 -0.842 0.560 1.523
9 0.750 0.096 -0.582 -0.583 -0.439 -0.563 0.139 -0.055 -1.149 -0.318 0.822 0.051 -0.420 0.022 -0.146 -0.197 0.606 0.546 0.959 1.834 0.744 0.312 -0.071 0.160 0.353 2.237 1.410 0.656 1.043 1.517 -0.094 -0.142 -0.171 -0.391 -0.195 0.454 0.057 -0.296 -0.304 0.041 -0.307 -0.307 -0.311 -0.233 -0.331 0.404 0.314 0.245 0.493 0.365 0.612 0.473 0.364 0.604 0.520 -0.286 -0.356 -0.405 -0.557 -0.421 0.771 0.435 -0.013 0.741 0.566 1.658 1.162 0.483 1.283 1.356 -0.022 0.244 0.356 -1.866 -0.129 0.223 -0.500 -0.593 -0.329 0.208 -0.641 -0.679 -0.368 -0.710 -0.275 -0.435
3 0.001 1.464 -0.638 -0.638 -0.951 0.955 1.140 0.450 0.059 -0.398 -0.600 0.229 -0.566 1.123 -0.146 -0.197 -0.542 0.511 -1.043 -0.545 -0.721 -0.132 0.038 0.208 -0.154 -0.604 0.019 0.183 0.288 -0.015 -0.273 -0.537 -0.617 -0.557 -0.536 -0.843 -1.060 -1.123 -0.745 -0.996 -0.193 0.138 0.245 0.355 0.132 0.038 0.751 0.913 0.922 0.718 0.105 0.842 0.989 0.952 0.787 -0.455 -0.453 -0.451 -0.313 -0.454 -0.476 2.572 3.433 2.458 2.144 -0.456 2.363 3.203 2.213 1.965 0.346 0.119 0.207 0.720 -0.027 0.060 -0.819 -0.935 -0.985 1.012 -0.641 0.142 -0.368 0.056 -0.275 0.870
18 -0.346 -0.363 1.622 1.619 -0.776 0.102 -0.432 -0.463 1.267 -1.477 1.354 -0.385 -1.077 -0.590 -0.146 -0.197 0.989 1.396 0.959 -0.545 -0.150 -0.128 -0.150 -0.061 -0.143 -0.986 -1.047 -0.351 -0.081 -0.696 -0.562 -0.609 -0.598 -0.862 -0.696 -0.200 -0.190 -0.166 -0.161 -0.193 -0.020 -0.022 -0.371 -0.399 -0.197 0.474 0.509 0.396 0.511 0.477 0.235 0.267 0.341 0.461 0.324 -0.242 -0.243 -0.309 -0.514 -0.333 0.782 0.959 0.328 0.623 0.743 0.073 0.089 -0.226 0.128 -0.004 -0.505 0.035 0.306 0.795 0.465 0.512 0.032 0.149 0.073 -0.286 -0.641 -0.735 -0.368 -0.763 -0.275 -0.435

Building Linear Regression model for cameraaccessory

In [315]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

cameraaccessory_model = LinearRegression().fit(X_cameraaccessory_train, y_cameraaccessory_train)
y_cameraaccessory_test_pred = cameraaccessory_model.predict(X_cameraaccessory_test)

print('R2 Score: {}'.format(r2_score(y_cameraaccessory_test, y_cameraaccessory_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_test, y_cameraaccessory_test_pred)))
R2 Score: 0.8378921655684648
Mean Squared Error: 0.16043753688743248
With simple linear regression, we get an R2 score of 0.84 and an MSE of 0.16 on the test set.

Building Linear Regression model for cameraaccessory using K-fold Cross Validation

We will use scikit-learn's cross_val_predict with 10-fold cross validation for our linear regression.

In [316]:
y_cameraaccessory = cameraaccessory_koy_df.pop('gmv')
X_cameraaccessory = cameraaccessory_koy_df
In [317]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

cameraaccessory_model_cv = LinearRegression().fit(X_cameraaccessory, y_cameraaccessory)
cameraaccessory_predictions_cv = cross_val_predict(cameraaccessory_model_cv, X_cameraaccessory, y_cameraaccessory, cv=10)
accuracy = metrics.r2_score(y_cameraaccessory, cameraaccessory_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory, cameraaccessory_predictions_cv)))
Cross-Predicted Accuracy: 0.2738719547779054
Mean Squared Error: 0.7261280452220947
With linear regression under 10-fold cross validation, we get an R2 score of 0.27 and an MSE of 0.73, well below the single-split R2 of 0.84.
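
As a minimal sketch of what the cross-validated scores measure, the snippet below runs the same scikit-learn calls on synthetic data. The arrays `X_demo` and `y_demo` and their coefficients are hypothetical stand-ins, not the notebook's data. Each row is predicted by a model that never saw it during fitting, and the pooled out-of-fold predictions are then scored once.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in for the standardised weekly data (hypothetical values).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5))
y_demo = X_demo @ np.array([0.5, -0.3, 0.0, 0.2, 0.1]) + rng.normal(scale=0.5, size=50)

# 10 contiguous folds; every row is predicted by a model trained on the other 9 folds.
folds = KFold(n_splits=10, shuffle=False)
oof_pred = cross_val_predict(LinearRegression(), X_demo, y_demo, cv=folds)

# Score the pooled out-of-fold predictions, as the notebook does.
cv_r2 = r2_score(y_demo, oof_pred)
cv_mse = mean_squared_error(y_demo, oof_pred)
print("out-of-fold R2:", cv_r2)
print("out-of-fold MSE:", cv_mse)
```

Passing an unfitted estimator is enough here, because cross_val_predict fits a fresh clone on each training fold.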

Determining Feature Importance for cameraaccessory using the model fitted without cross validation

In [318]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


cameraaccessory_lr_model_parameters = list(cameraaccessory_model.coef_)
cameraaccessory_lr_model_parameters.insert(0, cameraaccessory_model.intercept_)
cameraaccessory_lr_model_parameters = [round(x, 3) for x in cameraaccessory_lr_model_parameters]
cols = X_cameraaccessory_test.columns
cols = cols.insert(0, "constant")
cameraaccessory_lr_coef = list(zip(cols, cameraaccessory_lr_model_parameters))
cameraaccessory_lr_coef
Out[318]:
[('constant', -0.032),
 ('gmv_lag1', -0.128),
 ('Discount%', -0.009),
 ('deliverybdays', -0.047),
 ('deliverycdays', -0.047),
 ('sla', 0.055),
 ('product_procurement_sla', -0.066),
 ('is_cod', 0.02),
 ('is_mass_market', 0.056),
 ('product_vertical_cameraaccessory', 0.046),
 ('product_vertical_camerabag', 0.22),
 ('product_vertical_camerabattery', 0.15),
 ('product_vertical_camerabatterycharger', -0.017),
 ('product_vertical_camerabatterygrip', 0.022),
 ('product_vertical_cameraeyecup', 0.021),
 ('product_vertical_camerafilmrolls', -0.096),
 ('product_vertical_camerahousing', 0.152),
 ('product_vertical_cameraledlight', -0.026),
 ('product_vertical_cameramicrophone', -0.094),
 ('product_vertical_cameramount', 0.112),
 ('product_vertical_cameraremotecontrol', -0.008),
 ('product_vertical_cameratripod', 0.001),
 ('product_vertical_extensiontube', 0.091),
 ('product_vertical_filter', 0.118),
 ('product_vertical_flash', -0.049),
 ('product_vertical_flashshoeadapter', -0.026),
 ('product_vertical_lens', 0.337),
 ('product_vertical_reflectorumbrella', 0.016),
 ('product_vertical_softbox', 0.056),
 ('product_vertical_strap', 0.008),
 ('product_vertical_teleconverter', 0.0),
 ('product_vertical_telescope', -0.008),
 ('payday_week', -0.069),
 ('holiday_week', -0.008),
 ('Total Investment', -0.019),
 ('Total Investment_SMA_3', 0.014),
 ('Total Investment_SMA_5', -0.029),
 ('Total Investment_EMA_8', -0.01),
 ('Total_Investment_Ad_Stock', -0.003),
 ('TV', -0.046),
 ('TV_SMA_3', -0.028),
 ('TV_SMA_5', 0.045),
 ('TV_EMA_8', 0.033),
 ('TV_Ad_Stock', 0.007),
 ('Digital', -0.084),
 ('Digital_SMA_3', 0.042),
 ('Digital_SMA_5', -0.012),
 ('Digital_EMA_8', -0.011),
 ('Digital_Ad_Stock', -0.024),
 ('Sponsorship', -0.048),
 ('Sponsorship_SMA_3', 0.019),
 ('Sponsorship_SMA_5', -0.066),
 ('Sponsorship_EMA_8', -0.033),
 ('Sponsorship_Ad_Stock', -0.013),
 ('Content Marketing', 0.001),
 ('Content Marketing_SMA_3', 0.071),
 ('Content Marketing_SMA_5', -0.012),
 ('Content Marketing_EMA_8', -0.006),
 ('Content_Marketing_Ad_Stock', 0.017),
 ('Online marketing', 0.143),
 ('Online marketing_SMA_3', -0.0),
 ('Online marketing_SMA_5', 0.008),
 ('Online marketing_EMA_8', 0.024),
 ('Online_marketing_Ad_Stock', 0.038),
 ('Affiliates', 0.118),
 ('Affiliates_SMA_3', -0.023),
 ('Affiliates_SMA_5', 0.018),
 ('Affiliates_EMA_8', 0.025),
 ('Affiliates_Ad_Stock', 0.029),
 ('SEM', -0.021),
 ('SEM_SMA_3', 0.061),
 ('SEM_SMA_5', -0.005),
 ('SEM_EMA_8', 0.006),
 ('SEM_Ad_Stock', 0.006),
 ('Radio', 0.023),
 ('Radio_SMA_3', -0.026),
 ('Radio_SMA_5', 0.018),
 ('Radio_EMA_8', -0.014),
 ('Radio_Ad_Stock', -0.011),
 ('Other', -0.055),
 ('Other_SMA_3', -0.05),
 ('Other_SMA_5', 0.029),
 ('Other_EMA_8', -0.012),
 ('Other_Ad_Stock', -0.031),
 ('NPS', -0.025),
 ('NPS_SMA_3', 0.04),
 ('NPS_SMA_5', -0.012),
 ('Stock Index', -0.068),
 ('Stock Index_SMA_3', -0.073),
 ('Stock Index_SMA_5', -0.038),
 ('Max Temp', 0.062),
 ('Min Temp', -0.022),
 ('Mean Temp', 0.052),
 ('Heat Deg Days', -0.076),
 ('Cool Deg Days', -0.038),
 ('Total Rain (mm)', 0.024),
 ('Total Snow (cm)', -0.025),
 ('Total Precip (mm)', 0.017),
 ('Snow on Grnd (cm)', 0.016),
 ('Sale', 0.013)]
In [319]:
cameraaccessory_lr_coef_df = pd.DataFrame(cameraaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.rename(columns=col_rename)
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.iloc[1:,:]
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.loc[cameraaccessory_lr_coef_df['Coefficients']!=0.0]
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
cameraaccessory_lr_coef_df
Out[319]:
Features Coefficients
26 product_vertical_lens 0.337
10 product_vertical_camerabag 0.220
16 product_vertical_camerahousing 0.152
11 product_vertical_camerabattery 0.150
59 Online marketing 0.143
23 product_vertical_filter 0.118
64 Affiliates 0.118
19 product_vertical_cameramount 0.112
22 product_vertical_extensiontube 0.091
55 Content Marketing_SMA_3 0.071
90 Max Temp 0.062
70 SEM_SMA_3 0.061
8 is_mass_market 0.056
28 product_vertical_softbox 0.056
5 sla 0.055
92 Mean Temp 0.052
9 product_vertical_cameraaccessory 0.046
41 TV_SMA_5 0.045
45 Digital_SMA_3 0.042
85 NPS_SMA_3 0.040
63 Online_marketing_Ad_Stock 0.038
42 TV_EMA_8 0.033
81 Other_SMA_5 0.029
68 Affiliates_Ad_Stock 0.029
67 Affiliates_EMA_8 0.025
95 Total Rain (mm) 0.024
62 Online marketing_EMA_8 0.024
74 Radio 0.023
13 product_vertical_camerabatterygrip 0.022
14 product_vertical_cameraeyecup 0.021
7 is_cod 0.020
50 Sponsorship_SMA_3 0.019
66 Affiliates_SMA_5 0.018
76 Radio_SMA_5 0.018
58 Content_Marketing_Ad_Stock 0.017
97 Total Precip (mm) 0.017
27 product_vertical_reflectorumbrella 0.016
98 Snow on Grnd (cm) 0.016
35 Total Investment_SMA_3 0.014
99 Sale 0.013
29 product_vertical_strap 0.008
61 Online marketing_SMA_5 0.008
43 TV_Ad_Stock 0.007
73 SEM_Ad_Stock 0.006
72 SEM_EMA_8 0.006
54 Content Marketing 0.001
21 product_vertical_cameratripod 0.001
38 Total_Investment_Ad_Stock -0.003
71 SEM_SMA_5 -0.005
57 Content Marketing_EMA_8 -0.006
31 product_vertical_telescope -0.008
33 holiday_week -0.008
20 product_vertical_cameraremotecontrol -0.008
2 Discount% -0.009
37 Total Investment_EMA_8 -0.010
47 Digital_EMA_8 -0.011
78 Radio_Ad_Stock -0.011
86 NPS_SMA_5 -0.012
82 Other_EMA_8 -0.012
46 Digital_SMA_5 -0.012
56 Content Marketing_SMA_5 -0.012
53 Sponsorship_Ad_Stock -0.013
77 Radio_EMA_8 -0.014
12 product_vertical_camerabatterycharger -0.017
34 Total Investment -0.019
69 SEM -0.021
91 Min Temp -0.022
65 Affiliates_SMA_3 -0.023
48 Digital_Ad_Stock -0.024
96 Total Snow (cm) -0.025
84 NPS -0.025
25 product_vertical_flashshoeadapter -0.026
17 product_vertical_cameraledlight -0.026
75 Radio_SMA_3 -0.026
40 TV_SMA_3 -0.028
36 Total Investment_SMA_5 -0.029
83 Other_Ad_Stock -0.031
52 Sponsorship_EMA_8 -0.033
89 Stock Index_SMA_5 -0.038
94 Cool Deg Days -0.038
39 TV -0.046
3 deliverybdays -0.047
4 deliverycdays -0.047
49 Sponsorship -0.048
24 product_vertical_flash -0.049
80 Other_SMA_3 -0.050
79 Other -0.055
51 Sponsorship_SMA_5 -0.066
6 product_procurement_sla -0.066
87 Stock Index -0.068
32 payday_week -0.069
88 Stock Index_SMA_3 -0.073
93 Heat Deg Days -0.076
44 Digital -0.084
18 product_vertical_cameramicrophone -0.094
15 product_vertical_camerafilmrolls -0.096
1 gmv_lag1 -0.128

Determining Total Effect of KPIs from Koyck model

Equation for Koyck Model:

$$Y_t = \alpha + \mu Y_{t-1} + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \epsilon$$

(sales at time t depend on sales at time t-1)

If X1 is advertising, β1 is the current effect of advertising and the carry-over effect of advertising is β1 µ/(1-µ).

Therefore the total effect of advertising is the current effect plus the carry-over effect:

$$\begin{aligned}
\text{Total effect} &= \beta_1 + \beta_1 \frac{\mu}{1-\mu} \\
                    &= \frac{\beta_1}{1-\mu}
\end{aligned}$$
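
The formula above can be applied to a coefficient table without hard-coding µ. The following is a sketch only: the helper name `add_total_effect` is hypothetical, and the four rows are rounded coefficients copied from the cameraaccessory output above.

```python
import pandas as pd

# A few rounded coefficients copied from the cameraaccessory output above.
coef_df = pd.DataFrame({
    "Features": ["gmv_lag1", "Online marketing", "Affiliates", "Digital"],
    "Coefficients": [-0.128, 0.143, 0.118, -0.084],
})


def add_total_effect(df, lag_feature="gmv_lag1"):
    """Return a copy of df with a 'Total Effect' column equal to beta / (1 - mu)."""
    # mu is the coefficient of the lagged dependent variable.
    mu = df.loc[df["Features"] == lag_feature, "Coefficients"].iloc[0]
    if abs(mu) >= 1:
        raise ValueError("the Koyck carry-over sum only converges for |mu| < 1")
    out = df.copy()
    out["Total Effect"] = out["Coefficients"] / (1 - mu)
    return out


total_df = add_total_effect(coef_df)
print(total_df.round(3))
```

Looking up µ from the table keeps the divisor in step with the fitted model if the regression is re-run.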
Calculating the coefficient of lag variable
In [320]:
cameraaccessory_lr_coef_df.loc[cameraaccessory_lr_coef_df['Features'] == 'gmv_lag1'].Coefficients
Out[320]:
1   -0.128
Name: Coefficients, dtype: float64
In [321]:
cameraaccessory_lr_coef_df['Total Effect'] = cameraaccessory_lr_coef_df['Coefficients']/(1-(-0.128))
cameraaccessory_lr_coef_df
Out[321]:
Features Coefficients Total Effect
26 product_vertical_lens 0.337 0.299
10 product_vertical_camerabag 0.220 0.195
16 product_vertical_camerahousing 0.152 0.135
11 product_vertical_camerabattery 0.150 0.133
59 Online marketing 0.143 0.127
23 product_vertical_filter 0.118 0.105
64 Affiliates 0.118 0.105
19 product_vertical_cameramount 0.112 0.099
22 product_vertical_extensiontube 0.091 0.081
55 Content Marketing_SMA_3 0.071 0.063
90 Max Temp 0.062 0.055
70 SEM_SMA_3 0.061 0.054
8 is_mass_market 0.056 0.050
28 product_vertical_softbox 0.056 0.050
5 sla 0.055 0.049
92 Mean Temp 0.052 0.046
9 product_vertical_cameraaccessory 0.046 0.041
41 TV_SMA_5 0.045 0.040
45 Digital_SMA_3 0.042 0.037
85 NPS_SMA_3 0.040 0.035
63 Online_marketing_Ad_Stock 0.038 0.034
42 TV_EMA_8 0.033 0.029
81 Other_SMA_5 0.029 0.026
68 Affiliates_Ad_Stock 0.029 0.026
67 Affiliates_EMA_8 0.025 0.022
95 Total Rain (mm) 0.024 0.021
62 Online marketing_EMA_8 0.024 0.021
74 Radio 0.023 0.020
13 product_vertical_camerabatterygrip 0.022 0.020
14 product_vertical_cameraeyecup 0.021 0.019
7 is_cod 0.020 0.018
50 Sponsorship_SMA_3 0.019 0.017
66 Affiliates_SMA_5 0.018 0.016
76 Radio_SMA_5 0.018 0.016
58 Content_Marketing_Ad_Stock 0.017 0.015
97 Total Precip (mm) 0.017 0.015
27 product_vertical_reflectorumbrella 0.016 0.014
98 Snow on Grnd (cm) 0.016 0.014
35 Total Investment_SMA_3 0.014 0.012
99 Sale 0.013 0.012
29 product_vertical_strap 0.008 0.007
61 Online marketing_SMA_5 0.008 0.007
43 TV_Ad_Stock 0.007 0.006
73 SEM_Ad_Stock 0.006 0.005
72 SEM_EMA_8 0.006 0.005
54 Content Marketing 0.001 0.001
21 product_vertical_cameratripod 0.001 0.001
38 Total_Investment_Ad_Stock -0.003 -0.003
71 SEM_SMA_5 -0.005 -0.004
57 Content Marketing_EMA_8 -0.006 -0.005
31 product_vertical_telescope -0.008 -0.007
33 holiday_week -0.008 -0.007
20 product_vertical_cameraremotecontrol -0.008 -0.007
2 Discount% -0.009 -0.008
37 Total Investment_EMA_8 -0.010 -0.009
47 Digital_EMA_8 -0.011 -0.010
78 Radio_Ad_Stock -0.011 -0.010
86 NPS_SMA_5 -0.012 -0.011
82 Other_EMA_8 -0.012 -0.011
46 Digital_SMA_5 -0.012 -0.011
56 Content Marketing_SMA_5 -0.012 -0.011
53 Sponsorship_Ad_Stock -0.013 -0.012
77 Radio_EMA_8 -0.014 -0.012
12 product_vertical_camerabatterycharger -0.017 -0.015
34 Total Investment -0.019 -0.017
69 SEM -0.021 -0.019
91 Min Temp -0.022 -0.020
65 Affiliates_SMA_3 -0.023 -0.020
48 Digital_Ad_Stock -0.024 -0.021
96 Total Snow (cm) -0.025 -0.022
84 NPS -0.025 -0.022
25 product_vertical_flashshoeadapter -0.026 -0.023
17 product_vertical_cameraledlight -0.026 -0.023
75 Radio_SMA_3 -0.026 -0.023
40 TV_SMA_3 -0.028 -0.025
36 Total Investment_SMA_5 -0.029 -0.026
83 Other_Ad_Stock -0.031 -0.027
52 Sponsorship_EMA_8 -0.033 -0.029
89 Stock Index_SMA_5 -0.038 -0.034
94 Cool Deg Days -0.038 -0.034
39 TV -0.046 -0.041
3 deliverybdays -0.047 -0.042
4 deliverycdays -0.047 -0.042
49 Sponsorship -0.048 -0.043
24 product_vertical_flash -0.049 -0.043
80 Other_SMA_3 -0.050 -0.044
79 Other -0.055 -0.049
51 Sponsorship_SMA_5 -0.066 -0.059
6 product_procurement_sla -0.066 -0.059
87 Stock Index -0.068 -0.060
32 payday_week -0.069 -0.061
88 Stock Index_SMA_3 -0.073 -0.065
93 Heat Deg Days -0.076 -0.067
44 Digital -0.084 -0.074
18 product_vertical_cameramicrophone -0.094 -0.083
15 product_vertical_camerafilmrolls -0.096 -0.085
1 gmv_lag1 -0.128 -0.113

Plotting the Features in descending order of Importance for cameraaccessory

In [322]:
# Use a tall figure so that every feature label fits on the y-axis.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Total Effect', palette='husl', data=cameraaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive total effect on GMV (revenue) for cameraaccessory are:
Features Total Effect
product_vertical_lens 0.299
product_vertical_camerabag 0.195
product_vertical_camerahousing 0.135
product_vertical_camerabattery 0.133
Online marketing 0.127
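
The ranking above can be read straight off the table. Here is a small sketch, assuming a frame with the same `Features` and `Total Effect` columns; the rows are a subset copied from the cameraaccessory table, and `effects` is a hypothetical name.

```python
import pandas as pd

# Subset of rows copied from the cameraaccessory Total Effect table above.
effects = pd.DataFrame({
    "Features": [
        "product_vertical_camerabattery", "Digital", "product_vertical_lens",
        "Online marketing", "product_vertical_filter", "product_vertical_camerabag",
        "gmv_lag1", "product_vertical_camerahousing",
    ],
    "Total Effect": [0.133, -0.074, 0.299, 0.127, 0.105, 0.195, -0.113, 0.135],
})

# Five largest positive total effects.
top5 = effects.nlargest(5, "Total Effect")
print(top5.to_string(index=False))

# Ranking by magnitude instead, so that strongly negative drivers are not hidden.
by_magnitude = effects.reindex(
    effects["Total Effect"].abs().sort_values(ascending=False).index
)
print(by_magnitude.head(5).to_string(index=False))
```

Ranking by magnitude is an alternative view, not what the notebook reports; it matters when a negative coefficient is as large as the leading positive ones.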

Building Linear Regression model for gamingaccessory

In [323]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

gamingaccessory_model = LinearRegression().fit(X_gamingaccessory_train, y_gamingaccessory_train)
y_gamingaccessory_test_pred = gamingaccessory_model.predict(X_gamingaccessory_test)

print('R2 Score: {}'.format(r2_score(y_gamingaccessory_test, y_gamingaccessory_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_test, y_gamingaccessory_test_pred)))
R2 Score: 0.9320877359342649
Mean Squared Error: 0.054352830557210755
With simple linear regression, we get an R2 score of 0.93 and an MSE of 0.05 on the test set.

Building Linear Regression model for gamingaccessory using K-fold Cross Validation

We will use scikit-learn's cross_val_predict with 10-fold cross validation for our linear regression.

In [324]:
y_gamingaccessory = gamingaccessory_koy_df.pop('gmv')
X_gamingaccessory = gamingaccessory_koy_df
In [325]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

gamingaccessory_model_cv = LinearRegression().fit(X_gamingaccessory, y_gamingaccessory)
gamingaccessory_predictions_cv = cross_val_predict(gamingaccessory_model_cv, X_gamingaccessory, y_gamingaccessory, cv=10)
accuracy = metrics.r2_score(y_gamingaccessory, gamingaccessory_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory, gamingaccessory_predictions_cv)))
Cross-Predicted Accuracy: 0.49230878182346505
Mean Squared Error: 0.507691218176535
With linear regression under 10-fold cross validation, we get an R2 score of 0.49 and an MSE of 0.51, well below the single-split R2 of 0.93.
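
A single pooled score hides how much the fit varies from fold to fold. The sketch below uses cross_val_score, which the cell above imports but does not call, to list one R2 per fold. It runs on synthetic data: `X_demo`, `y_demo` and the coefficients are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (hypothetical values, not the notebook's frames).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(60, 4))
y_demo = X_demo @ np.array([0.6, -0.2, 0.3, 0.0]) + rng.normal(scale=0.4, size=60)

# One R2 value per held-out fold.
fold_r2 = cross_val_score(
    LinearRegression(), X_demo, y_demo, cv=KFold(n_splits=10), scoring="r2"
)
print("per-fold R2:", np.round(fold_r2, 3))
print("mean:", fold_r2.mean(), "std:", fold_r2.std())
```

The mean of the per-fold R2 values generally differs from the R2 computed on pooled cross_val_predict output, because each fold is scored against its own mean.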

Determining Feature Importance for gamingaccessory using the model fitted without cross validation

In [326]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


gamingaccessory_lr_model_parameters = list(gamingaccessory_model.coef_)
gamingaccessory_lr_model_parameters.insert(0, gamingaccessory_model.intercept_)
gamingaccessory_lr_model_parameters = [round(x, 3) for x in gamingaccessory_lr_model_parameters]
cols = X_gamingaccessory_test.columns
cols = cols.insert(0, "constant")
gamingaccessory_lr_coef = list(zip(cols, gamingaccessory_lr_model_parameters))
gamingaccessory_lr_coef
Out[326]:
[('constant', 0.021),
 ('gmv_lag1', -0.015),
 ('Discount%', -0.004),
 ('deliverybdays', 0.018),
 ('deliverycdays', 0.018),
 ('sla', 0.061),
 ('product_procurement_sla', 0.03),
 ('is_cod', 0.059),
 ('is_mass_market', 0.165),
 ('product_vertical_gamecontrolmount', -0.013),
 ('product_vertical_gamepad', 0.202),
 ('product_vertical_gamingaccessorykit', 0.117),
 ('product_vertical_gamingadapter', 0.046),
 ('product_vertical_gamingchargingstation', 0.008),
 ('product_vertical_gamingheadset', 0.175),
 ('product_vertical_gamingkeyboard', 0.075),
 ('product_vertical_gamingmemorycard', 0.032),
 ('product_vertical_gamingmouse', 0.108),
 ('product_vertical_gamingmousepad', -0.0),
 ('product_vertical_gamingspeaker', 0.035),
 ('product_vertical_joystickgamingwheel', 0.086),
 ('product_vertical_motioncontroller', 0.053),
 ('product_vertical_tvoutcableaccessory', 0.092),
 ('payday_week', -0.04),
 ('holiday_week', -0.01),
 ('Total Investment', 0.003),
 ('Total Investment_SMA_3', -0.001),
 ('Total Investment_SMA_5', 0.02),
 ('Total Investment_EMA_8', -0.008),
 ('Total_Investment_Ad_Stock', 0.002),
 ('TV', 0.079),
 ('TV_SMA_3', -0.024),
 ('TV_SMA_5', 0.001),
 ('TV_EMA_8', 0.003),
 ('TV_Ad_Stock', 0.017),
 ('Digital', -0.024),
 ('Digital_SMA_3', 0.035),
 ('Digital_SMA_5', -0.007),
 ('Digital_EMA_8', -0.038),
 ('Digital_Ad_Stock', -0.002),
 ('Sponsorship', -0.019),
 ('Sponsorship_SMA_3', 0.014),
 ('Sponsorship_SMA_5', 0.031),
 ('Sponsorship_EMA_8', -0.006),
 ('Sponsorship_Ad_Stock', 0.004),
 ('Content Marketing', -0.037),
 ('Content Marketing_SMA_3', 0.018),
 ('Content Marketing_SMA_5', 0.014),
 ('Content Marketing_EMA_8', -0.029),
 ('Content_Marketing_Ad_Stock', -0.011),
 ('Online marketing', 0.019),
 ('Online marketing_SMA_3', -0.011),
 ('Online marketing_SMA_5', 0.006),
 ('Online marketing_EMA_8', -0.006),
 ('Online_marketing_Ad_Stock', -0.002),
 ('Affiliates', 0.051),
 ('Affiliates_SMA_3', 0.004),
 ('Affiliates_SMA_5', 0.012),
 ('Affiliates_EMA_8', 0.004),
 ('Affiliates_Ad_Stock', 0.015),
 ('SEM', -0.03),
 ('SEM_SMA_3', 0.025),
 ('SEM_SMA_5', -0.004),
 ('SEM_EMA_8', -0.037),
 ('SEM_Ad_Stock', -0.009),
 ('Radio', 0.042),
 ('Radio_SMA_3', -0.067),
 ('Radio_SMA_5', 0.014),
 ('Radio_EMA_8', 0.029),
 ('Radio_Ad_Stock', -0.002),
 ('Other', 0.062),
 ('Other_SMA_3', -0.077),
 ('Other_SMA_5', 0.009),
 ('Other_EMA_8', 0.029),
 ('Other_Ad_Stock', 0.002),
 ('NPS', -0.007),
 ('NPS_SMA_3', 0.029),
 ('NPS_SMA_5', -0.04),
 ('Stock Index', 0.044),
 ('Stock Index_SMA_3', 0.031),
 ('Stock Index_SMA_5', -0.047),
 ('Max Temp', -0.037),
 ('Min Temp', 0.053),
 ('Mean Temp', -0.012),
 ('Heat Deg Days', 0.041),
 ('Cool Deg Days', 0.081),
 ('Total Rain (mm)', -0.019),
 ('Total Snow (cm)', -0.001),
 ('Total Precip (mm)', -0.018),
 ('Snow on Grnd (cm)', -0.022),
 ('Sale', 0.042)]
In [327]:
gamingaccessory_lr_coef_df = pd.DataFrame(gamingaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.rename(columns=col_rename)
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.iloc[1:,:]
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.loc[gamingaccessory_lr_coef_df['Coefficients']!=0.0]
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
gamingaccessory_lr_coef_df
Out[327]:
Features Coefficients
10 product_vertical_gamepad 0.202
14 product_vertical_gamingheadset 0.175
8 is_mass_market 0.165
11 product_vertical_gamingaccessorykit 0.117
17 product_vertical_gamingmouse 0.108
22 product_vertical_tvoutcableaccessory 0.092
20 product_vertical_joystickgamingwheel 0.086
85 Cool Deg Days 0.081
30 TV 0.079
15 product_vertical_gamingkeyboard 0.075
70 Other 0.062
5 sla 0.061
7 is_cod 0.059
82 Min Temp 0.053
21 product_vertical_motioncontroller 0.053
55 Affiliates 0.051
12 product_vertical_gamingadapter 0.046
78 Stock Index 0.044
65 Radio 0.042
90 Sale 0.042
84 Heat Deg Days 0.041
19 product_vertical_gamingspeaker 0.035
36 Digital_SMA_3 0.035
16 product_vertical_gamingmemorycard 0.032
79 Stock Index_SMA_3 0.031
42 Sponsorship_SMA_5 0.031
6 product_procurement_sla 0.030
76 NPS_SMA_3 0.029
73 Other_EMA_8 0.029
68 Radio_EMA_8 0.029
61 SEM_SMA_3 0.025
27 Total Investment_SMA_5 0.020
50 Online marketing 0.019
46 Content Marketing_SMA_3 0.018
4 deliverycdays 0.018
3 deliverybdays 0.018
34 TV_Ad_Stock 0.017
59 Affiliates_Ad_Stock 0.015
47 Content Marketing_SMA_5 0.014
67 Radio_SMA_5 0.014
41 Sponsorship_SMA_3 0.014
57 Affiliates_SMA_5 0.012
72 Other_SMA_5 0.009
13 product_vertical_gamingchargingstation 0.008
52 Online marketing_SMA_5 0.006
44 Sponsorship_Ad_Stock 0.004
58 Affiliates_EMA_8 0.004
56 Affiliates_SMA_3 0.004
33 TV_EMA_8 0.003
25 Total Investment 0.003
74 Other_Ad_Stock 0.002
29 Total_Investment_Ad_Stock 0.002
32 TV_SMA_5 0.001
87 Total Snow (cm) -0.001
26 Total Investment_SMA_3 -0.001
69 Radio_Ad_Stock -0.002
54 Online_marketing_Ad_Stock -0.002
39 Digital_Ad_Stock -0.002
62 SEM_SMA_5 -0.004
2 Discount% -0.004
53 Online marketing_EMA_8 -0.006
43 Sponsorship_EMA_8 -0.006
37 Digital_SMA_5 -0.007
75 NPS -0.007
28 Total Investment_EMA_8 -0.008
64 SEM_Ad_Stock -0.009
24 holiday_week -0.010
51 Online marketing_SMA_3 -0.011
49 Content_Marketing_Ad_Stock -0.011
83 Mean Temp -0.012
9 product_vertical_gamecontrolmount -0.013
1 gmv_lag1 -0.015
88 Total Precip (mm) -0.018
86 Total Rain (mm) -0.019
40 Sponsorship -0.019
89 Snow on Grnd (cm) -0.022
35 Digital -0.024
31 TV_SMA_3 -0.024
48 Content Marketing_EMA_8 -0.029
60 SEM -0.030
81 Max Temp -0.037
45 Content Marketing -0.037
63 SEM_EMA_8 -0.037
38 Digital_EMA_8 -0.038
23 payday_week -0.040
77 NPS_SMA_5 -0.040
80 Stock Index_SMA_5 -0.047
66 Radio_SMA_3 -0.067
71 Other_SMA_3 -0.077

Determining Total Effect of KPIs from Koyck model

Equation for Koyck Model:

$$Y_t = \alpha + \mu Y_{t-1} + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \epsilon$$

(sales at time t depend on sales at time t-1)

If X1 is advertising, β1 is the current effect of advertising and the carry-over effect of advertising is β1 µ/(1-µ).

Therefore the total effect of advertising is the current effect plus the carry-over effect:

$$\begin{aligned}
\text{Total effect} &= \beta_1 + \beta_1 \frac{\mu}{1-\mu} \\
                    &= \frac{\beta_1}{1-\mu}
\end{aligned}$$
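
For reference, the carry-over term comes from substituting the lagged equation into itself repeatedly. The derivation below is a sketch for a single regressor; the time index on X1 and on the error term is an added assumption, since the equation above writes them without one. The last line checks the formula against the TV coefficient of the gamingaccessory model (0.079, with µ = -0.015).

```latex
\begin{aligned}
Y_t &= \alpha + \mu Y_{t-1} + \beta_1 X_{1,t} + \epsilon_t \\
    &= \alpha + \mu\,(\alpha + \mu Y_{t-2} + \beta_1 X_{1,t-1} + \epsilon_{t-1})
       + \beta_1 X_{1,t} + \epsilon_t \\
    &= \dots \\
    &= \frac{\alpha}{1-\mu} + \beta_1 \sum_{k=0}^{\infty} \mu^k X_{1,t-k}
       + \sum_{k=0}^{\infty} \mu^k \epsilon_{t-k},
       \qquad |\mu| < 1 \\[6pt]
\text{carry-over effect} &= \beta_1 \sum_{k=1}^{\infty} \mu^k = \beta_1 \frac{\mu}{1-\mu} \\
\text{total effect} &= \beta_1 \sum_{k=0}^{\infty} \mu^k = \frac{\beta_1}{1-\mu} \\[6pt]
\text{TV, gamingaccessory:}\quad \frac{0.079}{1-(-0.015)} &= \frac{0.079}{1.015} \approx 0.078
\end{aligned}
```

The geometric sums converge only when |µ| < 1, which holds for all three lag coefficients reported in this section.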
Calculating the coefficient of lag variable
In [328]:
gamingaccessory_lr_coef_df.loc[gamingaccessory_lr_coef_df['Features'] == 'gmv_lag1'].Coefficients
Out[328]:
1   -0.015
Name: Coefficients, dtype: float64
In [329]:
gamingaccessory_lr_coef_df['Total Effect'] = gamingaccessory_lr_coef_df['Coefficients']/(1-(-0.015))
gamingaccessory_lr_coef_df
Out[329]:
Features Coefficients Total Effect
10 product_vertical_gamepad 0.202 0.199
14 product_vertical_gamingheadset 0.175 0.172
8 is_mass_market 0.165 0.163
11 product_vertical_gamingaccessorykit 0.117 0.115
17 product_vertical_gamingmouse 0.108 0.106
22 product_vertical_tvoutcableaccessory 0.092 0.091
20 product_vertical_joystickgamingwheel 0.086 0.085
85 Cool Deg Days 0.081 0.080
30 TV 0.079 0.078
15 product_vertical_gamingkeyboard 0.075 0.074
70 Other 0.062 0.061
5 sla 0.061 0.060
7 is_cod 0.059 0.058
82 Min Temp 0.053 0.052
21 product_vertical_motioncontroller 0.053 0.052
55 Affiliates 0.051 0.050
12 product_vertical_gamingadapter 0.046 0.045
78 Stock Index 0.044 0.043
65 Radio 0.042 0.041
90 Sale 0.042 0.041
84 Heat Deg Days 0.041 0.040
19 product_vertical_gamingspeaker 0.035 0.034
36 Digital_SMA_3 0.035 0.034
16 product_vertical_gamingmemorycard 0.032 0.032
79 Stock Index_SMA_3 0.031 0.031
42 Sponsorship_SMA_5 0.031 0.031
6 product_procurement_sla 0.030 0.030
76 NPS_SMA_3 0.029 0.029
73 Other_EMA_8 0.029 0.029
68 Radio_EMA_8 0.029 0.029
61 SEM_SMA_3 0.025 0.025
27 Total Investment_SMA_5 0.020 0.020
50 Online marketing 0.019 0.019
46 Content Marketing_SMA_3 0.018 0.018
4 deliverycdays 0.018 0.018
3 deliverybdays 0.018 0.018
34 TV_Ad_Stock 0.017 0.017
59 Affiliates_Ad_Stock 0.015 0.015
47 Content Marketing_SMA_5 0.014 0.014
67 Radio_SMA_5 0.014 0.014
41 Sponsorship_SMA_3 0.014 0.014
57 Affiliates_SMA_5 0.012 0.012
72 Other_SMA_5 0.009 0.009
13 product_vertical_gamingchargingstation 0.008 0.008
52 Online marketing_SMA_5 0.006 0.006
44 Sponsorship_Ad_Stock 0.004 0.004
58 Affiliates_EMA_8 0.004 0.004
56 Affiliates_SMA_3 0.004 0.004
33 TV_EMA_8 0.003 0.003
25 Total Investment 0.003 0.003
74 Other_Ad_Stock 0.002 0.002
29 Total_Investment_Ad_Stock 0.002 0.002
32 TV_SMA_5 0.001 0.001
87 Total Snow (cm) -0.001 -0.001
26 Total Investment_SMA_3 -0.001 -0.001
69 Radio_Ad_Stock -0.002 -0.002
54 Online_marketing_Ad_Stock -0.002 -0.002
39 Digital_Ad_Stock -0.002 -0.002
62 SEM_SMA_5 -0.004 -0.004
2 Discount% -0.004 -0.004
53 Online marketing_EMA_8 -0.006 -0.006
43 Sponsorship_EMA_8 -0.006 -0.006
37 Digital_SMA_5 -0.007 -0.007
75 NPS -0.007 -0.007
28 Total Investment_EMA_8 -0.008 -0.008
64 SEM_Ad_Stock -0.009 -0.009
24 holiday_week -0.010 -0.010
51 Online marketing_SMA_3 -0.011 -0.011
49 Content_Marketing_Ad_Stock -0.011 -0.011
83 Mean Temp -0.012 -0.012
9 product_vertical_gamecontrolmount -0.013 -0.013
1 gmv_lag1 -0.015 -0.015
88 Total Precip (mm) -0.018 -0.018
86 Total Rain (mm) -0.019 -0.019
40 Sponsorship -0.019 -0.019
89 Snow on Grnd (cm) -0.022 -0.022
35 Digital -0.024 -0.024
31 TV_SMA_3 -0.024 -0.024
48 Content Marketing_EMA_8 -0.029 -0.029
60 SEM -0.030 -0.030
81 Max Temp -0.037 -0.036
45 Content Marketing -0.037 -0.036
63 SEM_EMA_8 -0.037 -0.036
38 Digital_EMA_8 -0.038 -0.037
23 payday_week -0.040 -0.039
77 NPS_SMA_5 -0.040 -0.039
80 Stock Index_SMA_5 -0.047 -0.046
66 Radio_SMA_3 -0.067 -0.066
71 Other_SMA_3 -0.077 -0.076

Plotting the Features in descending order of Importance for gamingaccessory

In [330]:
# Use a tall figure so that every feature label fits on the y-axis.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Total Effect', palette='husl', data=gamingaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive total effect on GMV (revenue) for gamingaccessory are:
Features Total Effect
product_vertical_gamepad 0.199
product_vertical_gamingheadset 0.172
is_mass_market 0.163
product_vertical_gamingaccessorykit 0.115
product_vertical_gamingmouse 0.106

Building Linear Regression model for homeaudio

In [331]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

homeaudio_model = LinearRegression().fit(X_homeaudio_train, y_homeaudio_train)
y_homeaudio_test_pred = homeaudio_model.predict(X_homeaudio_test)

print('R2 Score: {}'.format(r2_score(y_homeaudio_test, y_homeaudio_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_test, y_homeaudio_test_pred)))
R2 Score: 0.9609857877052308
Mean Squared Error: 0.09354693362138522
With simple linear regression, we get an R2 score of 0.96 and an MSE of 0.09 on the test set.

Building Linear Regression model for homeaudio using K-fold Cross Validation

We will use scikit-learn's cross_val_predict with 5-fold cross validation for our linear regression.

In [332]:
y_homeaudio = homeaudio_koy_df.pop('gmv')
X_homeaudio = homeaudio_koy_df
In [333]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

homeaudio_model_cv = LinearRegression().fit(X_homeaudio, y_homeaudio)
homeaudio_predictions_cv = cross_val_predict(homeaudio_model_cv, X_homeaudio, y_homeaudio, cv=5)
accuracy = metrics.r2_score(y_homeaudio, homeaudio_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio, homeaudio_predictions_cv)))
Cross-Predicted Accuracy: 0.696145786955783
Mean Squared Error: 0.30385421304421695
With linear regression under 5-fold cross validation, we get an R2 score of 0.70 and an MSE of 0.30, below the single-split R2 of 0.96.

Determining Feature Importance for homeaudio using the model fitted without cross validation

In [334]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


homeaudio_lr_model_parameters = list(homeaudio_model.coef_)
homeaudio_lr_model_parameters.insert(0, homeaudio_model.intercept_)
homeaudio_lr_model_parameters = [round(x, 3) for x in homeaudio_lr_model_parameters]
cols = X_homeaudio_test.columns
cols = cols.insert(0, "constant")
homeaudio_lr_coef = list(zip(cols, homeaudio_lr_model_parameters))
homeaudio_lr_coef
Out[334]:
[('constant', -0.008),
 ('gmv_lag1', 0.06),
 ('Discount%', 0.037),
 ('deliverybdays', -0.021),
 ('deliverycdays', -0.034),
 ('sla', -0.159),
 ('product_procurement_sla', -0.087),
 ('is_cod', 0.152),
 ('is_mass_market', 0.197),
 ('product_vertical_djcontroller', 0.037),
 ('product_vertical_dock', 0.025),
 ('product_vertical_dockingstation', 0.005),
 ('product_vertical_fmradio', 0.087),
 ('product_vertical_hifisystem', 0.073),
 ('product_vertical_homeaudiospeaker', 0.391),
 ('product_vertical_karaokeplayer', 0.0),
 ('product_vertical_slingbox', 0.056),
 ('product_vertical_soundmixer', 0.04),
 ('product_vertical_voicerecorder', 0.012),
 ('payday_week', 0.063),
 ('holiday_week', -0.052),
 ('Total Investment', -0.036),
 ('Total Investment_SMA_3', 0.015),
 ('Total Investment_SMA_5', -0.012),
 ('Total Investment_EMA_8', 0.054),
 ('Total_Investment_Ad_Stock', -0.052),
 ('TV', -0.004),
 ('TV_SMA_3', 0.003),
 ('TV_SMA_5', 0.032),
 ('TV_EMA_8', -0.006),
 ('TV_Ad_Stock', -0.018),
 ('Digital', 0.015),
 ('Digital_SMA_3', 0.066),
 ('Digital_SMA_5', -0.042),
 ('Digital_EMA_8', 0.055),
 ('Digital_Ad_Stock', -0.004),
 ('Sponsorship', -0.063),
 ('Sponsorship_SMA_3', 0.01),
 ('Sponsorship_SMA_5', -0.03),
 ('Sponsorship_EMA_8', 0.039),
 ('Sponsorship_Ad_Stock', -0.072),
 ('Content Marketing', 0.011),
 ('Content Marketing_SMA_3', 0.012),
 ('Content Marketing_SMA_5', -0.088),
 ('Content Marketing_EMA_8', 0.053),
 ('Content_Marketing_Ad_Stock', -0.04),
 ('Online marketing', -0.02),
 ('Online marketing_SMA_3', -0.023),
 ('Online marketing_SMA_5', 0.011),
 ('Online marketing_EMA_8', 0.061),
 ('Online_marketing_Ad_Stock', -0.041),
 ('Affiliates', 0.01),
 ('Affiliates_SMA_3', 0.012),
 ('Affiliates_SMA_5', 0.041),
 ('Affiliates_EMA_8', 0.072),
 ('Affiliates_Ad_Stock', -0.006),
 ('SEM', 0.021),
 ('SEM_SMA_3', 0.042),
 ('SEM_SMA_5', -0.046),
 ('SEM_EMA_8', 0.077),
 ('SEM_Ad_Stock', -0.017),
 ('Radio', -0.005),
 ('Radio_SMA_3', -0.05),
 ('Radio_SMA_5', 0.021),
 ('Radio_EMA_8', -0.014),
 ('Radio_Ad_Stock', -0.016),
 ('Other', -0.009),
 ('Other_SMA_3', 0.0),
 ('Other_SMA_5', 0.061),
 ('Other_EMA_8', -0.028),
 ('Other_Ad_Stock', 0.016),
 ('NPS', 0.113),
 ('NPS_SMA_3', 0.021),
 ('NPS_SMA_5', 0.012),
 ('Stock Index', -0.11),
 ('Stock Index_SMA_3', -0.025),
 ('Stock Index_SMA_5', 0.022),
 ('Max Temp', -0.078),
 ('Min Temp', 0.011),
 ('Mean Temp', 0.101),
 ('Heat Deg Days', -0.115),
 ('Cool Deg Days', 0.027),
 ('Total Rain (mm)', -0.03),
 ('Total Snow (cm)', 0.017),
 ('Total Precip (mm)', -0.024),
 ('Snow on Grnd (cm)', 0.021),
 ('Sale', -0.018)]
In [335]:
homeaudio_lr_coef_df = pd.DataFrame(homeaudio_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
homeaudio_lr_coef_df = homeaudio_lr_coef_df.rename(columns=col_rename)
homeaudio_lr_coef_df = homeaudio_lr_coef_df.iloc[1:,:]
homeaudio_lr_coef_df = homeaudio_lr_coef_df.loc[homeaudio_lr_coef_df['Coefficients']!=0.0]
homeaudio_lr_coef_df = homeaudio_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
homeaudio_lr_coef_df
Out[335]:
Features Coefficients
14 product_vertical_homeaudiospeaker 0.391
8 is_mass_market 0.197
7 is_cod 0.152
71 NPS 0.113
79 Mean Temp 0.101
12 product_vertical_fmradio 0.087
59 SEM_EMA_8 0.077
13 product_vertical_hifisystem 0.073
54 Affiliates_EMA_8 0.072
32 Digital_SMA_3 0.066
19 payday_week 0.063
68 Other_SMA_5 0.061
49 Online marketing_EMA_8 0.061
1 gmv_lag1 0.060
16 product_vertical_slingbox 0.056
34 Digital_EMA_8 0.055
24 Total Investment_EMA_8 0.054
44 Content Marketing_EMA_8 0.053
57 SEM_SMA_3 0.042
53 Affiliates_SMA_5 0.041
17 product_vertical_soundmixer 0.040
39 Sponsorship_EMA_8 0.039
2 Discount% 0.037
9 product_vertical_djcontroller 0.037
28 TV_SMA_5 0.032
81 Cool Deg Days 0.027
10 product_vertical_dock 0.025
76 Stock Index_SMA_5 0.022
56 SEM 0.021
63 Radio_SMA_5 0.021
72 NPS_SMA_3 0.021
85 Snow on Grnd (cm) 0.021
83 Total Snow (cm) 0.017
70 Other_Ad_Stock 0.016
22 Total Investment_SMA_3 0.015
31 Digital 0.015
73 NPS_SMA_5 0.012
18 product_vertical_voicerecorder 0.012
52 Affiliates_SMA_3 0.012
42 Content Marketing_SMA_3 0.012
48 Online marketing_SMA_5 0.011
41 Content Marketing 0.011
78 Min Temp 0.011
37 Sponsorship_SMA_3 0.010
51 Affiliates 0.010
11 product_vertical_dockingstation 0.005
27 TV_SMA_3 0.003
35 Digital_Ad_Stock -0.004
26 TV -0.004
61 Radio -0.005
29 TV_EMA_8 -0.006
55 Affiliates_Ad_Stock -0.006
66 Other -0.009
23 Total Investment_SMA_5 -0.012
64 Radio_EMA_8 -0.014
65 Radio_Ad_Stock -0.016
60 SEM_Ad_Stock -0.017
86 Sale -0.018
30 TV_Ad_Stock -0.018
46 Online marketing -0.020
3 deliverybdays -0.021
47 Online marketing_SMA_3 -0.023
84 Total Precip (mm) -0.024
75 Stock Index_SMA_3 -0.025
69 Other_EMA_8 -0.028
82 Total Rain (mm) -0.030
38 Sponsorship_SMA_5 -0.030
4 deliverycdays -0.034
21 Total Investment -0.036
45 Content_Marketing_Ad_Stock -0.040
50 Online_marketing_Ad_Stock -0.041
33 Digital_SMA_5 -0.042
58 SEM_SMA_5 -0.046
62 Radio_SMA_3 -0.050
25 Total_Investment_Ad_Stock -0.052
20 holiday_week -0.052
36 Sponsorship -0.063
40 Sponsorship_Ad_Stock -0.072
77 Max Temp -0.078
6 product_procurement_sla -0.087
43 Content Marketing_SMA_5 -0.088
74 Stock Index -0.110
80 Heat Deg Days -0.115
5 sla -0.159

Determining Total Effect of KPIs from Koyck model

Equation for Koyck Model:

Yt = α + µYt-1 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + ϵ   (sales at time t depend on sales at time t-1)

If X1 is advertising, β1 is its current effect and its carry-over effect is β1 * µ/(1-µ). This assumes |µ| < 1, so that the lagged effects decay geometrically.

Therefore the total effect of advertising = Current effect + Carry-over effect

                                        = β1 + β1 * µ/(1-µ) 

                                        = β1/(1-µ)
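
As a quick numerical check of the formula above, the following sketch sums the carry-over series period by period and compares it with the closed form β1/(1-µ). The values of beta1 and mu are the rounded Home Audio coefficients of SEM_EMA_8 and gmv_lag1 from the coefficient list; the function names are illustrative assumptions and are not part of the notebook.

```python
def koyck_total_effect(beta, mu):
    """Closed-form total effect beta / (1 - mu); only valid for |mu| < 1."""
    if abs(mu) >= 1:
        raise ValueError("the carry-over series only converges for |mu| < 1")
    return beta / (1 - mu)


def koyck_total_effect_by_sum(beta, mu, periods=50):
    """Current effect plus carry-over, accumulated one period at a time."""
    return sum(beta * mu ** k for k in range(periods))


beta1 = 0.077  # rounded coefficient of SEM_EMA_8 (from the coefficient list)
mu = 0.060     # rounded coefficient of gmv_lag1 (from the coefficient list)

closed_form = koyck_total_effect(beta1, mu)
series_sum = koyck_total_effect_by_sum(beta1, mu)
carry_over = closed_form - beta1  # should equal beta1 * mu / (1 - mu)

print(round(closed_form, 3), round(series_sum, 3), round(carry_over, 3))
```

With µ this small, almost all of the total effect is the current effect. The carry-over share grows quickly as µ approaches 1.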
Calculating the coefficient of lag variable
In [336]:
homeaudio_lr_coef_df.loc[homeaudio_lr_coef_df['Features'] == 'gmv_lag1'].Coefficients
Out[336]:
1   0.060
Name: Coefficients, dtype: float64
In [337]:
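# µ is the coefficient of the lagged GMV term (gmv_lag1); read it from the frame instead of hard-coding it
mu = homeaudio_lr_coef_df.loc[homeaudio_lr_coef_df['Features'] == 'gmv_lag1', 'Coefficients'].iloc[0]
homeaudio_lr_coef_df['Total Effect'] = homeaudio_lr_coef_df['Coefficients']/(1-mu)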
homeaudio_lr_coef_df
Out[337]:
Features Coefficients Total Effect
14 product_vertical_homeaudiospeaker 0.391 0.416
8 is_mass_market 0.197 0.210
7 is_cod 0.152 0.162
71 NPS 0.113 0.120
79 Mean Temp 0.101 0.107
12 product_vertical_fmradio 0.087 0.093
59 SEM_EMA_8 0.077 0.082
13 product_vertical_hifisystem 0.073 0.078
54 Affiliates_EMA_8 0.072 0.077
32 Digital_SMA_3 0.066 0.070
19 payday_week 0.063 0.067
68 Other_SMA_5 0.061 0.065
49 Online marketing_EMA_8 0.061 0.065
1 gmv_lag1 0.060 0.064
16 product_vertical_slingbox 0.056 0.060
34 Digital_EMA_8 0.055 0.059
24 Total Investment_EMA_8 0.054 0.057
44 Content Marketing_EMA_8 0.053 0.056
57 SEM_SMA_3 0.042 0.045
53 Affiliates_SMA_5 0.041 0.044
17 product_vertical_soundmixer 0.040 0.043
39 Sponsorship_EMA_8 0.039 0.041
2 Discount% 0.037 0.039
9 product_vertical_djcontroller 0.037 0.039
28 TV_SMA_5 0.032 0.034
81 Cool Deg Days 0.027 0.029
10 product_vertical_dock 0.025 0.027
76 Stock Index_SMA_5 0.022 0.023
56 SEM 0.021 0.022
63 Radio_SMA_5 0.021 0.022
72 NPS_SMA_3 0.021 0.022
85 Snow on Grnd (cm) 0.021 0.022
83 Total Snow (cm) 0.017 0.018
70 Other_Ad_Stock 0.016 0.017
22 Total Investment_SMA_3 0.015 0.016
31 Digital 0.015 0.016
73 NPS_SMA_5 0.012 0.013
18 product_vertical_voicerecorder 0.012 0.013
52 Affiliates_SMA_3 0.012 0.013
42 Content Marketing_SMA_3 0.012 0.013
48 Online marketing_SMA_5 0.011 0.012
41 Content Marketing 0.011 0.012
78 Min Temp 0.011 0.012
37 Sponsorship_SMA_3 0.010 0.011
51 Affiliates 0.010 0.011
11 product_vertical_dockingstation 0.005 0.005
27 TV_SMA_3 0.003 0.003
35 Digital_Ad_Stock -0.004 -0.004
26 TV -0.004 -0.004
61 Radio -0.005 -0.005
29 TV_EMA_8 -0.006 -0.006
55 Affiliates_Ad_Stock -0.006 -0.006
66 Other -0.009 -0.010
23 Total Investment_SMA_5 -0.012 -0.013
64 Radio_EMA_8 -0.014 -0.015
65 Radio_Ad_Stock -0.016 -0.017
60 SEM_Ad_Stock -0.017 -0.018
86 Sale -0.018 -0.019
30 TV_Ad_Stock -0.018 -0.019
46 Online marketing -0.020 -0.021
3 deliverybdays -0.021 -0.022
47 Online marketing_SMA_3 -0.023 -0.024
84 Total Precip (mm) -0.024 -0.026
75 Stock Index_SMA_3 -0.025 -0.027
69 Other_EMA_8 -0.028 -0.030
82 Total Rain (mm) -0.030 -0.032
38 Sponsorship_SMA_5 -0.030 -0.032
4 deliverycdays -0.034 -0.036
21 Total Investment -0.036 -0.038
45 Content_Marketing_Ad_Stock -0.040 -0.043
50 Online_marketing_Ad_Stock -0.041 -0.044
33 Digital_SMA_5 -0.042 -0.045
58 SEM_SMA_5 -0.046 -0.049
62 Radio_SMA_3 -0.050 -0.053
25 Total_Investment_Ad_Stock -0.052 -0.055
20 holiday_week -0.052 -0.055
36 Sponsorship -0.063 -0.067
40 Sponsorship_Ad_Stock -0.072 -0.077
77 Max Temp -0.078 -0.083
6 product_procurement_sla -0.087 -0.093
43 Content Marketing_SMA_5 -0.088 -0.094
74 Stock Index -0.110 -0.117
80 Heat Deg Days -0.115 -0.122
5 sla -0.159 -0.169

Plotting the Features in descending order of Importance for homeaudio

In [338]:
# Use a tall figure so that all feature labels fit on the y-axis.
plt.figure(figsize=(10, 15), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Total Effect', palette='husl', data=homeaudio_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive total effect on GMV (Revenue) for homeaudio are:
Features Total Effect
product_vertical_homeaudiospeaker 0.416
is_mass_market 0.210
is_cod 0.162
NPS 0.120
Mean Temp 0.107
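
The list above ranks features by signed total effect, so large negative effects such as sla do not appear in it. As a hedged illustration, and not the notebook's own method, the sketch below ranks by absolute total effect instead. It uses only the eight rows with the largest magnitudes, copied from the Total Effect table; the frame name `effects` is an assumption made for this example.

```python
import pandas as pd

# Rows copied from the Total Effect table (the eight largest magnitudes).
effects = pd.DataFrame(
    {
        "Features": [
            "product_vertical_homeaudiospeaker",
            "is_mass_market",
            "is_cod",
            "NPS",
            "Mean Temp",
            "Stock Index",
            "Heat Deg Days",
            "sla",
        ],
        "Total Effect": [0.416, 0.210, 0.162, 0.120, 0.107, -0.117, -0.122, -0.169],
    }
)

# Ranking used in the list above: largest signed values first.
top_positive = effects.nlargest(5, "Total Effect")

# Alternative ranking: largest magnitudes first, keeping the sign for interpretation.
order = effects["Total Effect"].abs().sort_values(ascending=False).index
top_by_magnitude = effects.loc[order].head(5)

print(top_positive)
print(top_by_magnitude)
```

Ranking by magnitude brings sla and Heat Deg Days into the top five. This matters when the aim is to find the strongest drivers in either direction.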

Distributed Lag Model (Additive)

The Additive and Multiplicative Linear Models that we have built so far capture only the current effect of the KPIs. To capture the carry-over effect as well, we model the current revenue using past values of the KPIs.

In the distributed lag model, lagged versions of both the dependent variable and the independent variables enter the model. This is a more general model that captures the carry-over effect of all the variables. The Koyck model is thus a special case of the distributed lag model that includes lagged values of only the dependent variable.

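Yt = α + µ1Yt-1 + µ2Yt-2 + µ3Yt-3 + ....

    + β1,0 X1,t + β1,1 X1,t-1 + β1,2 X1,t-2 + ....

    + β2,0 X2,t + β2,1 X2,t-1 + β2,2 X2,t-2 + ....

    + β3,0 X3,t + β3,1 X3,t-1 + β3,2 X3,t-2 + ....

    + β4,0 X4,t + β4,1 X4,t-1 + β4,2 X4,t-2 + ....

    + β5,0 X5,t + β5,1 X5,t-1 + β5,2 X5,t-2 + ....

    + ϵ

where βi,k is the coefficient of variable Xi at lag k, so each lag of each variable has its own coefficient.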
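
Each lagged term in the equation above corresponds to a shifted copy of a weekly column. The notebook's own `lag_variables` helper is defined elsewhere; the sketch below is a hypothetical stand-in (`add_lag_columns` is an assumed name, not the notebook's function) that shows the idea with pandas `shift`. It uses the first five gmv and Discount% values from the Home Audio frame.

```python
import pandas as pd


def add_lag_columns(df, columns, lag):
    """Return a copy of df with a '<col>_lag<lag>' column placed right after each listed column."""
    out = df.copy()
    for col in columns:
        position = out.columns.get_loc(col) + 1
        # shift(lag) moves values down by `lag` rows; the first `lag` rows become NaN
        out.insert(position, f"{col}_lag{lag}", out[col].shift(lag))
    return out


weekly = pd.DataFrame(
    {
        "gmv": [4573783.133, 5371525.0, 4679828.0, 3451151.0, 2599.0],
        "Discount%": [31.451, 32.967, 32.357, 32.208, 16.130],
    },
    index=[25, 26, 27, 28, 29],
)

base_columns = list(weekly.columns)
lagged = add_lag_columns(weekly, base_columns, lag=3)
lagged = add_lag_columns(lagged, base_columns, lag=2)

print(lagged.columns.tolist())
print(lagged)
```

The first rows of every lagged column are NaN because no earlier week exists to shift from. Those rows need to be dropped or filled before fitting a model.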
In [339]:
homeaudio_org_df.head()
Out[339]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [340]:
# Making copy of dataframes from the original ones
cameraaccessory_dladd_df = cameraaccessory_org_df.copy()
gamingaccessory_dladd_df = gamingaccessory_org_df.copy()
homeaudio_dladd_df = homeaudio_org_df.copy()
homeaudio_dladd_df.head()
Out[340]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [341]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_dladd_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_dladd_df.isnull().sum()/homeaudio_dladd_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[341]:
Total Percentage
Sale 0 0.000
Digital 0 0.000
Total Investment_SMA_5 0 0.000
Total Investment_EMA_8 0 0.000
Total_Investment_Ad_Stock 0 0.000

We will drop the Week column, as it is a row identifier and will not help in predicting revenue.

In [342]:
# removing columns
cameraaccessory_dladd_df = cameraaccessory_dladd_df.drop('Week', axis=1)
gamingaccessory_dladd_df = gamingaccessory_dladd_df.drop('Week', axis=1)
homeaudio_dladd_df = homeaudio_dladd_df.drop('Week', axis=1)
homeaudio_dladd_df.head()
Out[342]:
gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0

Creating new lag variables (lag of 3 weeks) for the dependent variable (GMV) as well as the independent variables

In [343]:
cameraaccessory_dladd_df_columns = cameraaccessory_dladd_df.columns
gamingaccessory_dladd_df_columns = gamingaccessory_dladd_df.columns
homeaudio_dladd_df_columns = homeaudio_dladd_df.columns
In [344]:
cameraaccessory_dladd_df = lag_variables(cameraaccessory_dladd_df,cameraaccessory_dladd_df_columns,3)
gamingaccessory_dladd_df = lag_variables(gamingaccessory_dladd_df,gamingaccessory_dladd_df_columns,3)
homeaudio_dladd_df = lag_variables(homeaudio_dladd_df,homeaudio_dladd_df_columns,3)
homeaudio_dladd_df.head()
Out[344]:
gmv gmv_lag3 Discount% Discount%_lag3 deliverybdays deliverybdays_lag3 deliverycdays deliverycdays_lag3 sla sla_lag3 product_procurement_sla product_procurement_sla_lag3 is_cod is_cod_lag3 is_mass_market is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag3 payday_week payday_week_lag3 holiday_week holiday_week_lag3 Total Investment Total Investment_lag3 Total Investment_SMA_3 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag3 TV TV_lag3 TV_SMA_3 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag3 Digital Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag3 Online marketing Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag3 Online marketing_SMA_5 
Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag3 SEM SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag3 Radio Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag3 Other Other_lag3 Other_SMA_3 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag3 NPS NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag3 Stock Index Stock Index_lag3 Stock Index_SMA_3 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag3 Min Temp Min Temp_lag3 Mean Temp Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag3 Sale Sale_lag3
25 4573783.133 nan 31.451 nan 0.000 nan 0.000 nan 7.369 nan 2.863 nan 1583 nan 1366 nan 8 nan 33 nan 1 nan 516.000 nan 23 nan 1374.000 nan 0 nan 0 nan 0 nan 63 nan 0 nan 0 nan 4.265 nan 0.000 nan 0.000 nan 4.265 nan 4.265 nan 0.054 nan 0.000 nan 0.000 nan 0.054 nan 0.054 nan 0.633 nan 0.000 nan 0.000 nan 0.633 nan 0.633 nan 1.854 nan 0.000 nan 0.000 nan 1.854 nan 1.854 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.332 nan 0.000 nan 0.000 nan 0.332 nan 0.332 nan 0.137 nan 0.000 nan 0.000 nan 0.137 nan 0.137 nan 1.256 nan 0.000 nan 0.000 nan 1.256 nan 1.256 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 54.600 nan 0.000 nan 0.000 nan 1177.000 nan 0.000 nan 0.000 nan 28.000 nan 12.500 nan 20.100 nan 0.283 nan 2.383 nan 4.417 nan 0.000 nan 4.417 nan 0.000 nan 0 nan
26 5371525.000 nan 32.967 nan 0.000 nan 0.000 nan 6.985 nan 2.746 nan 1868 nan 1610 nan 7 nan 50 nan 1 nan 574.000 nan 42 nan 1623.000 nan 0 nan 0 nan 0 nan 69 nan 1 nan 0 nan 4.265 nan 0.000 nan 0.000 nan 4.265 nan 6.824 nan 0.054 nan 0.000 nan 0.000 nan 0.054 nan 0.086 nan 0.633 nan 0.000 nan 0.000 nan 0.633 nan 1.013 nan 1.854 nan 0.000 nan 0.000 nan 1.854 nan 2.966 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.332 nan 0.000 nan 0.000 nan 0.332 nan 0.531 nan 0.137 nan 0.000 nan 0.000 nan 0.137 nan 0.219 nan 1.256 nan 0.000 nan 0.000 nan 1.256 nan 2.010 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 54.600 nan 0.000 nan 0.000 nan 1177.000 nan 0.000 nan 0.000 nan 33.000 nan 11.000 nan 23.183 nan 0.000 nan 5.183 nan 1.400 nan 0.000 nan 1.400 nan 0.000 nan 2 nan
27 4679828.000 nan 32.357 nan 0.000 nan 0.000 nan 7.072 nan 2.861 nan 1758 nan 1569 nan 4 nan 56 nan 0 nan 577.000 nan 36 nan 1430.000 nan 0 nan 0 nan 0 nan 46 nan 0 nan 0 nan 4.265 nan 4.265 nan 0.000 nan 4.265 nan 8.359 nan 0.054 nan 0.054 nan 0.000 nan 0.054 nan 0.106 nan 0.633 nan 0.633 nan 0.000 nan 0.633 nan 1.241 nan 1.854 nan 1.854 nan 0.000 nan 1.854 nan 3.634 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.332 nan 0.332 nan 0.000 nan 0.332 nan 0.651 nan 0.137 nan 0.137 nan 0.000 nan 0.137 nan 0.269 nan 1.256 nan 1.256 nan 0.000 nan 1.256 nan 2.462 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 54.600 nan 54.600 nan 0.000 nan 1177.000 nan 1177.000 nan 0.000 nan 31.500 nan 14.500 nan 23.060 nan 0.000 nan 5.060 nan 1.080 nan 0.000 nan 1.080 nan 0.000 nan 0 nan
28 3451151.000 4573783.133 32.208 31.451 0.000 0.000 0.000 0.000 7.201 7.369 2.735 2.863 1244 1583.000 1072 1366.000 2 8.000 43 33.000 0 1.000 420.000 516.000 20 23.000 1025.000 1374.000 0 0.000 0 0.000 0 0.000 44 63.000 1 0.000 0 0.000 4.265 4.265 4.265 0.000 0.000 0.000 4.265 4.265 9.281 4.265 0.054 0.054 0.054 0.000 0.000 0.000 0.054 0.054 0.118 0.054 0.633 0.633 0.633 0.000 0.000 0.000 0.633 0.633 1.377 0.633 1.854 1.854 1.854 0.000 0.000 0.000 1.854 1.854 4.034 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.000 0.000 0.000 0.332 0.332 0.722 0.332 0.137 0.137 0.137 0.000 0.000 0.000 0.137 0.137 0.298 0.137 1.256 1.256 1.256 0.000 0.000 0.000 1.256 1.256 2.733 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 0.000 0.000 0.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 33.500 28.000 16.000 12.500 24.567 20.100 0.000 0.283 6.567 2.383 4.633 4.417 0.000 0.000 4.633 4.417 0.000 0.000 0 0.000
29 2599.000 5371525.000 16.130 32.967 0.000 0.000 0.000 0.000 9.000 6.985 2.000 2.746 0 1868.000 0 1610.000 0 7.000 0 50.000 0 1.000 0.000 574.000 0 42.000 1.000 1623.000 0 0.000 0 0.000 0 0.000 0 69.000 0 1.000 0 0.000 1.013 4.265 3.181 0.000 3.615 0.000 3.542 4.265 6.581 6.824 0.001 0.054 0.036 0.000 0.043 0.000 0.042 0.054 0.072 0.086 0.256 0.633 0.507 0.000 0.558 0.000 0.549 0.633 1.082 1.013 0.213 1.854 1.307 0.000 1.526 0.000 1.489 1.854 2.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.230 0.000 0.271 0.000 0.264 0.332 0.459 0.531 0.015 0.137 0.096 0.000 0.113 0.000 0.110 0.137 0.194 0.219 0.503 1.256 1.005 0.000 1.105 0.000 1.089 1.256 2.143 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 56.395 0.000 55.677 0.000 1206.000 1177.000 1186.667 0.000 1182.800 0.000 28.500 33.000 15.000 11.000 21.650 23.183 0.000 0.000 3.650 5.183 0.350 1.400 0.000 0.000 0.350 1.400 0.000 0.000 0 2.000

Creating new lag variables (lag of 2 weeks) for the dependent variable (GMV) as well as the independent variables

In [345]:
cameraaccessory_dladd_df = lag_variables(cameraaccessory_dladd_df,cameraaccessory_dladd_df_columns,2)
gamingaccessory_dladd_df = lag_variables(gamingaccessory_dladd_df,gamingaccessory_dladd_df_columns,2)
homeaudio_dladd_df = lag_variables(homeaudio_dladd_df,homeaudio_dladd_df_columns,2)
homeaudio_dladd_df.head()
Out[345]:
gmv gmv_lag2 gmv_lag3 Discount% Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag2 deliverycdays_lag3 sla sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag2 Total Investment_lag3 Total Investment_SMA_3 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag2 
Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online marketing Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock 
Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag2 Stock Index_lag3 Stock Index_SMA_3 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag2 Sale_lag3
25 4573783.133 nan nan 31.451 nan nan 0.000 nan nan 0.000 nan nan 7.369 nan nan 2.863 nan nan 1583 nan nan 1366 nan nan 8 nan nan 33 nan nan 1 nan nan 516.000 nan nan 23 nan nan 1374.000 nan nan 0 nan nan 0 nan nan 0 nan nan 63 nan nan 0 nan nan 0 nan nan 4.265 nan nan 0.000 nan nan 0.000 nan nan 4.265 nan nan 4.265 nan nan 0.054 nan nan 0.000 nan nan 0.000 nan nan 0.054 nan nan 0.054 nan nan 0.633 nan nan 0.000 nan nan 0.000 nan nan 0.633 nan nan 0.633 nan nan 1.854 nan nan 0.000 nan nan 0.000 nan nan 1.854 nan nan 1.854 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.332 nan nan 0.137 nan nan 0.000 nan nan 0.000 nan nan 0.137 nan nan 0.137 nan nan 1.256 nan nan 0.000 nan nan 0.000 nan nan 1.256 nan nan 1.256 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 54.600 nan nan 0.000 nan nan 0.000 nan nan 1177.000 nan nan 0.000 nan nan 0.000 nan nan 28.000 nan nan 12.500 nan nan 20.100 nan nan 0.283 nan nan 2.383 nan nan 4.417 nan nan 0.000 nan nan 4.417 nan nan 0.000 nan nan 0 nan nan
26 5371525.000 nan nan 32.967 nan nan 0.000 nan nan 0.000 nan nan 6.985 nan nan 2.746 nan nan 1868 nan nan 1610 nan nan 7 nan nan 50 nan nan 1 nan nan 574.000 nan nan 42 nan nan 1623.000 nan nan 0 nan nan 0 nan nan 0 nan nan 69 nan nan 1 nan nan 0 nan nan 4.265 nan nan 0.000 nan nan 0.000 nan nan 4.265 nan nan 6.824 nan nan 0.054 nan nan 0.000 nan nan 0.000 nan nan 0.054 nan nan 0.086 nan nan 0.633 nan nan 0.000 nan nan 0.000 nan nan 0.633 nan nan 1.013 nan nan 1.854 nan nan 0.000 nan nan 0.000 nan nan 1.854 nan nan 2.966 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.531 nan nan 0.137 nan nan 0.000 nan nan 0.000 nan nan 0.137 nan nan 0.219 nan nan 1.256 nan nan 0.000 nan nan 0.000 nan nan 1.256 nan nan 2.010 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 54.600 nan nan 0.000 nan nan 0.000 nan nan 1177.000 nan nan 0.000 nan nan 0.000 nan nan 33.000 nan nan 11.000 nan nan 23.183 nan nan 0.000 nan nan 5.183 nan nan 1.400 nan nan 0.000 nan nan 1.400 nan nan 0.000 nan nan 2 nan nan
27 4679828.000 4573783.133 nan 32.357 31.451 nan 0.000 0.000 nan 0.000 0.000 nan 7.072 7.369 nan 2.861 2.863 nan 1758 1583.000 nan 1569 1366.000 nan 4 8.000 nan 56 33.000 nan 0 1.000 nan 577.000 516.000 nan 36 23.000 nan 1430.000 1374.000 nan 0 0.000 nan 0 0.000 nan 0 0.000 nan 46 63.000 nan 0 0.000 nan 0 0.000 nan 4.265 4.265 nan 4.265 0.000 nan 0.000 0.000 nan 4.265 4.265 nan 8.359 4.265 nan 0.054 0.054 nan 0.054 0.000 nan 0.000 0.000 nan 0.054 0.054 nan 0.106 0.054 nan 0.633 0.633 nan 0.633 0.000 nan 0.000 0.000 nan 0.633 0.633 nan 1.241 0.633 nan 1.854 1.854 nan 1.854 0.000 nan 0.000 0.000 nan 1.854 1.854 nan 3.634 1.854 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.332 0.332 nan 0.332 0.000 nan 0.000 0.000 nan 0.332 0.332 nan 0.651 0.332 nan 0.137 0.137 nan 0.137 0.000 nan 0.000 0.000 nan 0.137 0.137 nan 0.269 0.137 nan 1.256 1.256 nan 1.256 0.000 nan 0.000 0.000 nan 1.256 1.256 nan 2.462 1.256 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 54.600 54.600 nan 54.600 0.000 nan 0.000 0.000 nan 1177.000 1177.000 nan 1177.000 0.000 nan 0.000 0.000 nan 31.500 28.000 nan 14.500 12.500 nan 23.060 20.100 nan 0.000 0.283 nan 5.060 2.383 nan 1.080 4.417 nan 0.000 0.000 nan 1.080 4.417 nan 0.000 0.000 nan 0 0.000 nan
28 3451151.000 5371525.000 4573783.133 32.208 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 7.201 6.985 7.369 2.735 2.746 2.863 1244 1868.000 1583.000 1072 1610.000 1366.000 2 7.000 8.000 43 50.000 33.000 0 1.000 1.000 420.000 574.000 516.000 20 42.000 23.000 1025.000 1623.000 1374.000 0 0.000 0.000 0 0.000 0.000 0 0.000 0.000 44 69.000 63.000 1 1.000 0.000 0 0.000 0.000 4.265 4.265 4.265 4.265 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 9.281 6.824 4.265 0.054 0.054 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.118 0.086 0.054 0.633 0.633 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 1.377 1.013 0.633 1.854 1.854 1.854 1.854 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 4.034 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.722 0.531 0.332 0.137 0.137 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.298 0.219 0.137 1.256 1.256 1.256 1.256 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 2.733 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 54.600 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 33.500 33.000 28.000 16.000 11.000 12.500 24.567 23.183 20.100 0.000 0.000 0.283 6.567 5.183 2.383 4.633 1.400 4.417 0.000 0.000 0.000 4.633 1.400 4.417 0.000 0.000 0.000 0 2.000 0.000
29 2599.000 4679828.000 5371525.000 16.130 32.357 32.967 0.000 0.000 0.000 0.000 0.000 0.000 9.000 7.072 6.985 2.000 2.861 2.746 0 1758.000 1868.000 0 1569.000 1610.000 0 4.000 7.000 0 56.000 50.000 0 0.000 1.000 0.000 577.000 574.000 0 36.000 42.000 1.000 1430.000 1623.000 0 0.000 0.000 0 0.000 0.000 0 0.000 0.000 0 46.000 69.000 0 0.000 1.000 0 0.000 0.000 1.013 4.265 4.265 3.181 4.265 0.000 3.615 0.000 0.000 3.542 4.265 4.265 6.581 8.359 6.824 0.001 0.054 0.054 0.036 0.054 0.000 0.043 0.000 0.000 0.042 0.054 0.054 0.072 0.106 0.086 0.256 0.633 0.633 0.507 0.633 0.000 0.558 0.000 0.000 0.549 0.633 0.633 1.082 1.241 1.013 0.213 1.854 1.854 1.307 1.854 0.000 1.526 0.000 0.000 1.489 1.854 1.854 2.634 3.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.332 0.230 0.332 0.000 0.271 0.000 0.000 0.264 0.332 0.332 0.459 0.651 0.531 0.015 0.137 0.137 0.096 0.137 0.000 0.113 0.000 0.000 0.110 0.137 0.137 0.194 0.269 0.219 0.503 1.256 1.256 1.005 1.256 0.000 1.105 0.000 0.000 1.089 1.256 1.256 2.143 2.462 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 54.600 56.395 54.600 0.000 55.677 0.000 0.000 1206.000 1177.000 1177.000 1186.667 1177.000 0.000 1182.800 0.000 0.000 28.500 31.500 33.000 15.000 14.500 11.000 21.650 23.060 23.183 0.000 0.000 0.000 3.650 5.060 5.183 0.350 1.080 1.400 0.000 0.000 0.000 0.350 1.080 1.400 0.000 0.000 0.000 0 0.000 2.000

Creating new 1-week lag (lag 1) variables for the dependent variable (GMV) as well as the independent variables

In [346]:
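# Add 1-week lag columns (<col>_lag1) for each column in the corresponding *_dladd_df_columns list
cameraaccessory_dladd_df = lag_variables(cameraaccessory_dladd_df,cameraaccessory_dladd_df_columns,1)
gamingaccessory_dladd_df = lag_variables(gamingaccessory_dladd_df,gamingaccessory_dladd_df_columns,1)
homeaudio_dladd_df = lag_variables(homeaudio_dladd_df,homeaudio_dladd_df_columns,1)
# Preview the Home Audio frame, which now carries lag1, lag2 and lag3 columns
homeaudio_dladd_df.head()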
Out[346]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 4573783.133 nan nan nan 31.451 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 7.369 nan nan nan 2.863 nan nan nan 1583 nan nan nan 1366 nan nan nan 8 nan nan nan 33 nan nan nan 1 nan nan nan 516.000 nan nan nan 23 nan nan nan 1374.000 nan nan nan 0 nan nan nan 0 nan nan nan 0 nan nan nan 63 nan nan nan 0 nan nan nan 0 nan nan nan 4.265 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 4.265 nan nan nan 4.265 nan nan nan 0.054 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.054 nan nan nan 0.054 nan nan nan 0.633 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.633 nan nan nan 0.633 nan nan nan 1.854 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 1.854 nan nan nan 1.854 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.332 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.332 nan nan nan 0.332 nan nan nan 0.137 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.137 nan nan nan 0.137 nan nan nan 1.256 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 1.256 nan nan nan 1.256 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 54.600 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 1177.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 28.000 nan nan nan 12.500 nan nan nan 20.100 nan nan nan 0.283 nan nan nan 2.383 nan nan nan 4.417 nan nan nan 0.000 nan nan nan 4.417 nan nan nan 0.000 nan nan nan 0 nan nan nan
26 5371525.000 4573783.133 nan nan 32.967 31.451 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 6.985 7.369 nan nan 2.746 2.863 nan nan 1868 1583.000 nan nan 1610 1366.000 nan nan 7 8.000 nan nan 50 33.000 nan nan 1 1.000 nan nan 574.000 516.000 nan nan 42 23.000 nan nan 1623.000 1374.000 nan nan 0 0.000 nan nan 0 0.000 nan nan 0 0.000 nan nan 69 63.000 nan nan 1 0.000 nan nan 0 0.000 nan nan 4.265 4.265 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 4.265 4.265 nan nan 6.824 4.265 nan nan 0.054 0.054 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.054 0.054 nan nan 0.086 0.054 nan nan 0.633 0.633 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.633 0.633 nan nan 1.013 0.633 nan nan 1.854 1.854 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 1.854 1.854 nan nan 2.966 1.854 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.332 0.332 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.332 0.332 nan nan 0.531 0.332 nan nan 0.137 0.137 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.137 0.137 nan nan 0.219 0.137 nan nan 1.256 1.256 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 1.256 1.256 nan nan 2.010 1.256 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 54.600 54.600 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 1177.000 1177.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 33.000 28.000 nan nan 11.000 12.500 nan nan 23.183 20.100 nan nan 0.000 0.283 nan nan 5.183 2.383 nan nan 1.400 4.417 nan nan 0.000 0.000 nan nan 1.400 4.417 nan nan 0.000 0.000 nan nan 2 0.000 nan nan
27 4679828.000 5371525.000 4573783.133 nan 32.357 32.967 31.451 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 7.072 6.985 7.369 nan 2.861 2.746 2.863 nan 1758 1868.000 1583.000 nan 1569 1610.000 1366.000 nan 4 7.000 8.000 nan 56 50.000 33.000 nan 0 1.000 1.000 nan 577.000 574.000 516.000 nan 36 42.000 23.000 nan 1430.000 1623.000 1374.000 nan 0 0.000 0.000 nan 0 0.000 0.000 nan 0 0.000 0.000 nan 46 69.000 63.000 nan 0 1.000 0.000 nan 0 0.000 0.000 nan 4.265 4.265 4.265 nan 4.265 0.000 0.000 nan 0.000 0.000 0.000 nan 4.265 4.265 4.265 nan 8.359 6.824 4.265 nan 0.054 0.054 0.054 nan 0.054 0.000 0.000 nan 0.000 0.000 0.000 nan 0.054 0.054 0.054 nan 0.106 0.086 0.054 nan 0.633 0.633 0.633 nan 0.633 0.000 0.000 nan 0.000 0.000 0.000 nan 0.633 0.633 0.633 nan 1.241 1.013 0.633 nan 1.854 1.854 1.854 nan 1.854 0.000 0.000 nan 0.000 0.000 0.000 nan 1.854 1.854 1.854 nan 3.634 2.966 1.854 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.332 0.332 0.332 nan 0.332 0.000 0.000 nan 0.000 0.000 0.000 nan 0.332 0.332 0.332 nan 0.651 0.531 0.332 nan 0.137 0.137 0.137 nan 0.137 0.000 0.000 nan 0.000 0.000 0.000 nan 0.137 0.137 0.137 nan 0.269 0.219 0.137 nan 1.256 1.256 1.256 nan 1.256 0.000 0.000 nan 0.000 0.000 0.000 nan 1.256 1.256 1.256 nan 2.462 2.010 1.256 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 54.600 54.600 54.600 nan 54.600 0.000 0.000 nan 0.000 0.000 0.000 nan 1177.000 1177.000 1177.000 nan 1177.000 0.000 0.000 nan 0.000 0.000 0.000 nan 31.500 33.000 28.000 nan 14.500 11.000 12.500 nan 23.060 23.183 20.100 nan 0.000 0.000 0.283 nan 5.060 5.183 2.383 nan 1.080 1.400 4.417 nan 0.000 0.000 0.000 nan 1.080 1.400 4.417 nan 0.000 0.000 0.000 nan 0 2.000 0.000 nan
28 3451151.000 4679828.000 5371525.000 4573783.133 32.208 32.357 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.201 7.072 6.985 7.369 2.735 2.861 2.746 2.863 1244 1758.000 1868.000 1583.000 1072 1569.000 1610.000 1366.000 2 4.000 7.000 8.000 43 56.000 50.000 33.000 0 0.000 1.000 1.000 420.000 577.000 574.000 516.000 20 36.000 42.000 23.000 1025.000 1430.000 1623.000 1374.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 44 46.000 69.000 63.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 4.265 4.265 4.265 4.265 4.265 4.265 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 4.265 9.281 8.359 6.824 4.265 0.054 0.054 0.054 0.054 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.054 0.118 0.106 0.086 0.054 0.633 0.633 0.633 0.633 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 0.633 1.377 1.241 1.013 0.633 1.854 1.854 1.854 1.854 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 1.854 4.034 3.634 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.722 0.651 0.531 0.332 0.137 0.137 0.137 0.137 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.137 0.298 0.269 0.219 0.137 1.256 1.256 1.256 1.256 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 1.256 2.733 2.462 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 54.600 54.600 54.600 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 33.500 31.500 33.000 28.000 16.000 14.500 11.000 12.500 24.567 23.060 23.183 20.100 0.000 0.000 0.000 0.283 
6.567 5.060 5.183 2.383 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 0 0.000 2.000 0.000
29 2599.000 3451151.000 4679828.000 5371525.000 16.130 32.208 32.357 32.967 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 9.000 7.201 7.072 6.985 2.000 2.735 2.861 2.746 0 1244.000 1758.000 1868.000 0 1072.000 1569.000 1610.000 0 2.000 4.000 7.000 0 43.000 56.000 50.000 0 0.000 0.000 1.000 0.000 420.000 577.000 574.000 0 20.000 36.000 42.000 1.000 1025.000 1430.000 1623.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 44.000 46.000 69.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 1.013 4.265 4.265 4.265 3.181 4.265 4.265 0.000 3.615 0.000 0.000 0.000 3.542 4.265 4.265 4.265 6.581 9.281 8.359 6.824 0.001 0.054 0.054 0.054 0.036 0.054 0.054 0.000 0.043 0.000 0.000 0.000 0.042 0.054 0.054 0.054 0.072 0.118 0.106 0.086 0.256 0.633 0.633 0.633 0.507 0.633 0.633 0.000 0.558 0.000 0.000 0.000 0.549 0.633 0.633 0.633 1.082 1.377 1.241 1.013 0.213 1.854 1.854 1.854 1.307 1.854 1.854 0.000 1.526 0.000 0.000 0.000 1.489 1.854 1.854 1.854 2.634 4.034 3.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.332 0.332 0.230 0.332 0.332 0.000 0.271 0.000 0.000 0.000 0.264 0.332 0.332 0.332 0.459 0.722 0.651 0.531 0.015 0.137 0.137 0.137 0.096 0.137 0.137 0.000 0.113 0.000 0.000 0.000 0.110 0.137 0.137 0.137 0.194 0.298 0.269 0.219 0.503 1.256 1.256 1.256 1.005 1.256 1.256 0.000 1.105 0.000 0.000 0.000 1.089 1.256 1.256 1.256 2.143 2.733 2.462 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 54.600 54.600 56.395 54.600 54.600 0.000 55.677 0.000 0.000 0.000 1206.000 1177.000 1177.000 1177.000 1186.667 1177.000 1177.000 0.000 1182.800 0.000 0.000 0.000 28.500 33.500 31.500 33.000 15.000 16.000 14.500 11.000 21.650 24.567 23.060 23.183 0.000 0.000 0.000 0.000 3.650 6.567 
5.060 5.183 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0 0.000 0.000 2.000
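
The `lag_variables` helper is defined elsewhere in the notebook and is not reproduced here. As a minimal sketch of the idea, assuming it is built on pandas `shift`, the hypothetical function `add_lags` below creates `<col>_lag<n>` columns on a toy frame. The name `add_lags`, the toy values and the column placement (appended at the end rather than next to the source column) are illustrative assumptions, not the notebook's implementation.

```python
import pandas as pd

def add_lags(df, columns, lag):
    """Hypothetical helper: return a copy of df with '<col>_lag<lag>' columns added."""
    out = df.copy()
    for col in columns:
        # shift(lag) moves each value down by `lag` rows, so week t sees week t-lag
        out[f"{col}_lag{lag}"] = out[col].shift(lag)
    return out

# Toy weekly data indexed by week number (values are made up for illustration)
toy = pd.DataFrame({"gmv": [100.0, 120.0, 90.0, 110.0]}, index=[25, 26, 27, 28])
toy_lagged = add_lags(toy, ["gmv"], 1)
print(toy_lagged)
```

The first `lag` rows of every lagged column have no earlier week to draw from and come out as NaN, which is the pattern visible in the output above.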

Imputing all null values with 0 (the nulls sit in the first rows of the lag columns, where no earlier week is available)
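
The effect of this step can be seen on a toy frame. The sketch below uses made-up values and the illustrative names `toy` and `filled`. It counts missing values before and after `fillna`, and it returns a new frame rather than using `inplace=True`.

```python
import numpy as np
import pandas as pd

# Toy frame: the lag column is missing in the first week (values are made up)
toy = pd.DataFrame(
    {"gmv": [100.0, 120.0, 90.0], "gmv_lag1": [np.nan, 100.0, 120.0]},
    index=[25, 26, 27],
)

n_missing_before = int(toy.isna().sum().sum())
filled = toy.fillna(value=0)  # returns a new frame; toy itself is unchanged
n_missing_after = int(filled.isna().sum().sum())
print(n_missing_before, n_missing_after)
```

A model fitted later will treat these zeros as real observations of the previous week, so the first few weeks are worth keeping in mind when reading the lag coefficients.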

In [347]:
# Imputing all null values with 0
cameraaccessory_dladd_df.fillna(value=0, inplace=True)
gamingaccessory_dladd_df.fillna(value=0, inplace=True)
homeaudio_dladd_df.fillna(value=0, inplace=True)
homeaudio_dladd_df.head(10)
Out[347]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 4573783.133 0.000 0.000 0.000 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.369 0.000 0.000 0.000 2.863 0.000 0.000 0.000 1583 0.000 0.000 0.000 1366 0.000 0.000 0.000 8 0.000 0.000 0.000 33 0.000 0.000 0.000 1 0.000 0.000 0.000 516.000 0.000 0.000 0.000 23 0.000 0.000 0.000 1374.000 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 63 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 4.265 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.265 0.000 0.000 0.000 4.265 0.000 0.000 0.000 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.000 0.000 0.000 0.054 0.000 0.000 0.000 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.000 0.000 0.000 0.633 0.000 0.000 0.000 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.854 0.000 0.000 0.000 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.000 0.332 0.000 0.000 0.000 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.000 0.000 0.000 0.137 0.000 0.000 0.000 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.256 0.000 0.000 0.000 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 28.000 0.000 0.000 0.000 12.500 0.000 0.000 0.000 20.100 0.000 0.000 0.000 0.283 0.000 0.000 0.000 2.383 0.000 0.000 0.000 4.417 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.417 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
26 5371525.000 4573783.133 0.000 0.000 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.985 7.369 0.000 0.000 2.746 2.863 0.000 0.000 1868 1583.000 0.000 0.000 1610 1366.000 0.000 0.000 7 8.000 0.000 0.000 50 33.000 0.000 0.000 1 1.000 0.000 0.000 574.000 516.000 0.000 0.000 42 23.000 0.000 0.000 1623.000 1374.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 69 63.000 0.000 0.000 1 0.000 0.000 0.000 0 0.000 0.000 0.000 4.265 4.265 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 0.000 0.000 6.824 4.265 0.000 0.000 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.000 0.000 0.086 0.054 0.000 0.000 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.000 0.000 1.013 0.633 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 0.000 0.000 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.000 0.531 0.332 0.000 0.000 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.000 0.000 0.219 0.137 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 0.000 0.000 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 33.000 28.000 0.000 0.000 11.000 12.500 0.000 0.000 23.183 20.100 0.000 0.000 0.000 0.283 0.000 0.000 5.183 2.383 0.000 0.000 1.400 4.417 0.000 0.000 0.000 0.000 0.000 
0.000 1.400 4.417 0.000 0.000 0.000 0.000 0.000 0.000 2 0.000 0.000 0.000
27 4679828.000 5371525.000 4573783.133 0.000 32.357 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.072 6.985 7.369 0.000 2.861 2.746 2.863 0.000 1758 1868.000 1583.000 0.000 1569 1610.000 1366.000 0.000 4 7.000 8.000 0.000 56 50.000 33.000 0.000 0 1.000 1.000 0.000 577.000 574.000 516.000 0.000 36 42.000 23.000 0.000 1430.000 1623.000 1374.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 46 69.000 63.000 0.000 0 1.000 0.000 0.000 0 0.000 0.000 0.000 4.265 4.265 4.265 0.000 4.265 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 0.000 8.359 6.824 4.265 0.000 0.054 0.054 0.054 0.000 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.000 0.106 0.086 0.054 0.000 0.633 0.633 0.633 0.000 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 0.000 1.241 1.013 0.633 0.000 1.854 1.854 1.854 0.000 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 0.000 3.634 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.000 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.000 0.651 0.531 0.332 0.000 0.137 0.137 0.137 0.000 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.000 0.269 0.219 0.137 0.000 1.256 1.256 1.256 0.000 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 0.000 2.462 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 0.000 54.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 0.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 31.500 33.000 28.000 0.000 14.500 11.000 12.500 0.000 23.060 23.183 20.100 0.000 0.000 0.000 0.283 0.000 5.060 5.183 2.383 0.000 1.080 1.400 
4.417 0.000 0.000 0.000 0.000 0.000 1.080 1.400 4.417 0.000 0.000 0.000 0.000 0.000 0 2.000 0.000 0.000
28 3451151.000 4679828.000 5371525.000 4573783.133 32.208 32.357 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.201 7.072 6.985 7.369 2.735 2.861 2.746 2.863 1244 1758.000 1868.000 1583.000 1072 1569.000 1610.000 1366.000 2 4.000 7.000 8.000 43 56.000 50.000 33.000 0 0.000 1.000 1.000 420.000 577.000 574.000 516.000 20 36.000 42.000 23.000 1025.000 1430.000 1623.000 1374.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 44 46.000 69.000 63.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 4.265 4.265 4.265 4.265 4.265 4.265 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 4.265 9.281 8.359 6.824 4.265 0.054 0.054 0.054 0.054 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.054 0.118 0.106 0.086 0.054 0.633 0.633 0.633 0.633 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 0.633 1.377 1.241 1.013 0.633 1.854 1.854 1.854 1.854 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 1.854 4.034 3.634 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.722 0.651 0.531 0.332 0.137 0.137 0.137 0.137 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.137 0.298 0.269 0.219 0.137 1.256 1.256 1.256 1.256 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 1.256 2.733 2.462 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 54.600 54.600 54.600 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 33.500 31.500 33.000 28.000 16.000 14.500 11.000 12.500 24.567 23.060 23.183 20.100 0.000 0.000 0.000 0.283 
6.567 5.060 5.183 2.383 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 0 0.000 2.000 0.000
29 2599.000 3451151.000 4679828.000 5371525.000 16.130 32.208 32.357 32.967 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 9.000 7.201 7.072 6.985 2.000 2.735 2.861 2.746 0 1244.000 1758.000 1868.000 0 1072.000 1569.000 1610.000 0 2.000 4.000 7.000 0 43.000 56.000 50.000 0 0.000 0.000 1.000 0.000 420.000 577.000 574.000 0 20.000 36.000 42.000 1.000 1025.000 1430.000 1623.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 44.000 46.000 69.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 1.013 4.265 4.265 4.265 3.181 4.265 4.265 0.000 3.615 0.000 0.000 0.000 3.542 4.265 4.265 4.265 6.581 9.281 8.359 6.824 0.001 0.054 0.054 0.054 0.036 0.054 0.054 0.000 0.043 0.000 0.000 0.000 0.042 0.054 0.054 0.054 0.072 0.118 0.106 0.086 0.256 0.633 0.633 0.633 0.507 0.633 0.633 0.000 0.558 0.000 0.000 0.000 0.549 0.633 0.633 0.633 1.082 1.377 1.241 1.013 0.213 1.854 1.854 1.854 1.307 1.854 1.854 0.000 1.526 0.000 0.000 0.000 1.489 1.854 1.854 1.854 2.634 4.034 3.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.332 0.332 0.230 0.332 0.332 0.000 0.271 0.000 0.000 0.000 0.264 0.332 0.332 0.332 0.459 0.722 0.651 0.531 0.015 0.137 0.137 0.137 0.096 0.137 0.137 0.000 0.113 0.000 0.000 0.000 0.110 0.137 0.137 0.137 0.194 0.298 0.269 0.219 0.503 1.256 1.256 1.256 1.005 1.256 1.256 0.000 1.105 0.000 0.000 0.000 1.089 1.256 1.256 1.256 2.143 2.733 2.462 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 54.600 54.600 56.395 54.600 54.600 0.000 55.677 0.000 0.000 0.000 1206.000 1177.000 1177.000 1177.000 1186.667 1177.000 1177.000 0.000 1182.800 0.000 0.000 0.000 28.500 33.500 31.500 33.000 15.000 16.000 14.500 11.000 21.650 24.567 23.060 23.183 0.000 0.000 0.000 0.000 3.650 6.567 
5.060 5.183 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0 0.000 0.000 2.000
30 3875305.000 2599.000 3451151.000 4679828.000 35.972 16.130 32.208 32.357 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 5.599 9.000 7.201 7.072 2.790 2.000 2.735 2.861 1427 0.000 1244.000 1758.000 1326 0.000 1072.000 1569.000 5 0.000 2.000 4.000 48 0.000 43.000 56.000 1 0.000 0.000 0.000 525.000 0.000 420.000 577.000 36 0.000 20.000 36.000 1108.000 1.000 1025.000 1430.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 66 0.000 44.000 46.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 1.013 1.013 4.265 4.265 1.013 3.181 4.265 4.265 1.013 3.615 0.000 0.000 1.939 3.542 4.265 4.265 3.057 6.581 9.281 8.359 0.001 0.001 0.054 0.054 0.001 0.036 0.054 0.054 0.001 0.043 0.000 0.000 0.016 0.042 0.054 0.054 0.011 0.072 0.118 0.106 0.256 0.256 0.633 0.633 0.256 0.507 0.633 0.633 0.256 0.558 0.000 0.000 0.363 0.549 0.633 0.633 0.697 1.082 1.377 1.241 0.213 0.213 1.854 1.854 0.213 1.307 1.854 1.854 0.213 1.526 0.000 0.000 0.680 1.489 1.854 1.854 0.805 2.634 4.034 3.634 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.026 0.332 0.332 0.026 0.230 0.332 0.332 0.026 0.271 0.000 0.000 0.113 0.264 0.332 0.332 0.116 0.459 0.722 0.651 0.015 0.015 0.137 0.137 0.015 0.096 0.137 0.137 0.015 0.113 0.000 0.000 0.050 0.110 0.137 0.137 0.058 0.194 0.298 0.269 0.503 0.503 1.256 1.256 0.503 1.005 1.256 1.256 0.503 1.105 0.000 0.000 0.717 1.089 1.256 1.256 1.372 2.143 2.733 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 59.987 54.600 54.600 59.987 56.395 54.600 54.600 59.987 55.677 0.000 0.000 1206.000 1206.000 1177.000 1177.000 1206.000 1186.667 1177.000 1177.000 1206.000 1182.800 0.000 0.000 32.000 28.500 33.500 31.500 17.500 15.000 16.000 14.500 24.460 21.650 24.567 23.060 0.000 0.000 0.000 0.000 6.460 
3.650 6.567 5.060 12.120 0.350 4.633 1.080 0.000 0.000 0.000 0.000 12.120 0.350 4.633 1.080 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
31 4190321.000 3875305.000 2599.000 3451151.000 35.737 35.972 16.130 32.208 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 5.577 5.599 9.000 7.201 2.897 2.790 2.000 2.735 1655 1427.000 0.000 1244.000 1508 1326.000 0.000 1072.000 7 5.000 0.000 2.000 53 48.000 0.000 43.000 3 1.000 0.000 0.000 609.000 525.000 0.000 420.000 35 36.000 0.000 20.000 1215.000 1108.000 1.000 1025.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 81 66.000 0.000 44.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 24.064 1.013 1.013 4.265 8.697 1.013 3.181 4.265 5.623 1.013 3.615 0.000 6.855 1.939 3.542 4.265 25.898 3.057 6.581 9.281 0.970 0.001 0.001 0.054 0.324 0.001 0.036 0.054 0.195 0.001 0.043 0.000 0.228 0.016 0.042 0.054 0.977 0.011 0.072 0.118 0.339 0.256 0.256 0.633 0.284 0.256 0.507 0.633 0.273 0.256 0.558 0.000 0.358 0.363 0.549 0.633 0.757 0.697 1.082 1.377 15.697 0.213 0.213 1.854 5.374 0.213 1.307 1.854 3.310 0.213 1.526 0.000 4.017 0.680 1.489 1.854 16.180 0.805 2.634 4.034 0.153 0.000 0.000 0.000 0.051 0.000 0.000 0.000 0.031 0.000 0.000 0.000 0.034 0.000 0.000 0.000 0.153 0.000 0.000 0.000 4.095 0.026 0.026 0.332 1.382 0.026 0.230 0.332 0.840 0.026 0.271 0.000 0.998 0.113 0.264 0.332 4.165 0.116 0.459 0.722 1.260 0.015 0.015 0.137 0.430 0.015 0.096 0.137 0.264 0.015 0.113 0.000 0.319 0.050 0.110 0.137 1.295 0.058 0.194 0.298 1.551 0.503 0.503 1.256 0.852 0.503 1.005 1.256 0.713 0.503 1.105 0.000 0.903 0.717 1.089 1.256 2.374 1.372 2.143 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 59.987 59.987 54.600 55.633 59.987 56.395 54.600 57.375 59.987 55.677 0.000 1101.000 1206.000 1206.000 1177.000 1171.000 1206.000 1186.667 1177.000 1185.000 1206.000 1182.800 0.000 32.500 32.000 28.500 33.500 9.000 17.500 15.000 16.000 19.240 24.460 21.650 24.567 1.280 0.000 0.000 
0.000 2.520 6.460 3.650 6.567 0.960 12.120 0.350 4.633 0.000 0.000 0.000 0.000 0.960 12.120 0.350 4.633 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
32 3740780.000 4190321.000 3875305.000 2599.000 35.231 35.737 35.972 16.130 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.246 5.577 5.599 9.000 2.696 2.897 2.790 2.000 1459 1655.000 1427.000 0.000 1336 1508.000 1326.000 0.000 7 7.000 5.000 0.000 52 53.000 48.000 0.000 4 3.000 1.000 0.000 532.000 609.000 525.000 0.000 32 35.000 36.000 0.000 1110.000 1215.000 1108.000 1.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 45 81.000 66.000 0.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 24.064 24.064 1.013 1.013 16.380 8.697 1.013 3.181 10.233 5.623 1.013 3.615 10.680 6.855 1.939 3.542 39.603 25.898 3.057 6.581 0.970 0.970 0.001 0.001 0.647 0.324 0.001 0.036 0.389 0.195 0.001 0.043 0.393 0.228 0.016 0.042 1.556 0.977 0.011 0.072 0.339 0.339 0.256 0.256 0.311 0.284 0.256 0.507 0.289 0.273 0.256 0.558 0.354 0.358 0.363 0.549 0.793 0.757 0.697 1.082 15.697 15.697 0.213 0.213 10.536 5.374 0.213 1.307 6.407 3.310 0.213 1.526 6.613 4.017 0.680 1.489 25.405 16.180 0.805 2.634 0.153 0.153 0.000 0.000 0.102 0.051 0.000 0.000 0.061 0.031 0.000 0.000 0.060 0.034 0.000 0.000 0.245 0.153 0.000 0.000 4.095 4.095 0.026 0.026 2.739 1.382 0.026 0.230 1.654 0.840 0.026 0.271 1.686 0.998 0.113 0.264 6.594 4.165 0.116 0.459 1.260 1.260 0.015 0.015 0.845 0.430 0.015 0.096 0.513 0.264 0.015 0.113 0.528 0.319 0.050 0.110 2.037 1.295 0.058 0.194 1.551 1.551 0.503 0.503 1.202 0.852 0.503 1.005 0.922 0.713 0.503 1.105 1.047 0.903 0.717 1.089 2.976 2.374 1.372 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 59.987 59.987 51.279 55.633 59.987 56.395 54.762 57.375 59.987 55.677 1101.000 1101.000 1206.000 1206.000 1136.000 1171.000 1206.000 1186.667 1164.000 1185.000 1206.000 1182.800 27.500 32.500 32.000 28.500 13.000 9.000 17.500 15.000 20.550 19.240 24.460 21.650 0.000 
1.280 0.000 0.000 2.550 2.520 6.460 3.650 1.100 0.960 12.120 0.350 0.000 0.000 0.000 0.000 1.100 0.960 12.120 0.350 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
33 4212446.000 3740780.000 4190321.000 3875305.000 34.340 35.231 35.737 35.972 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.332 6.246 5.577 5.599 2.673 2.696 2.897 2.790 1745 1459.000 1655.000 1427.000 1686 1336.000 1508.000 1326.000 6 7.000 7.000 5.000 52 52.000 53.000 48.000 3 4.000 3.000 1.000 716.000 532.000 609.000 525.000 30 32.000 35.000 36.000 1276.000 1110.000 1215.000 1108.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 69 45.000 81.000 66.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 24.064 24.064 24.064 1.013 24.064 16.380 8.697 1.013 14.844 10.233 5.623 1.013 13.654 10.680 6.855 1.939 47.826 39.603 25.898 3.057 0.970 0.970 0.970 0.001 0.970 0.647 0.324 0.001 0.582 0.389 0.195 0.001 0.521 0.393 0.228 0.016 1.904 1.556 0.977 0.011 0.339 0.339 0.339 0.256 0.339 0.311 0.284 0.256 0.306 0.289 0.273 0.256 0.350 0.354 0.358 0.363 0.815 0.793 0.757 0.697 15.697 15.697 15.697 0.213 15.697 10.536 5.374 0.213 9.503 6.407 3.310 0.213 8.631 6.613 4.017 0.680 30.940 25.405 16.180 0.805 0.153 0.153 0.153 0.000 0.153 0.102 0.051 0.000 0.092 0.061 0.031 0.000 0.081 0.060 0.034 0.000 0.300 0.245 0.153 0.000 4.095 4.095 4.095 0.026 4.095 2.739 1.382 0.026 2.467 1.654 0.840 0.026 2.221 1.686 0.998 0.113 8.051 6.594 4.165 0.116 1.260 1.260 1.260 0.015 1.260 0.845 0.430 0.015 0.762 0.513 0.264 0.015 0.691 0.528 0.319 0.050 2.482 2.037 1.295 0.058 1.551 1.551 1.551 0.503 1.551 1.202 0.852 0.503 1.132 0.922 0.713 0.503 1.159 1.047 0.903 0.717 3.336 2.976 2.374 1.372 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 46.925 59.987 46.925 51.279 55.633 59.987 52.150 54.762 57.375 59.987 1101.000 1101.000 1101.000 1206.000 1101.000 1136.000 1171.000 1206.000 1143.000 1164.000 1185.000 1206.000 25.500 27.500 32.500 32.000 14.500 13.000 9.000 17.500 20.000 20.550 
19.240 24.460 0.000 0.000 1.280 0.000 2.000 2.550 2.520 6.460 0.000 1.100 0.960 12.120 0.000 0.000 0.000 0.000 0.000 1.100 0.960 12.120 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
34 4149262.000 4212446.000 3740780.000 4190321.000 33.715 34.340 35.231 35.737 0.004 0.000 0.000 0.000 0.005 0.000 0.000 0.000 6.394 6.332 6.246 5.577 2.515 2.673 2.696 2.897 1653 1745.000 1459.000 1655.000 1519 1686.000 1336.000 1508.000 4 6.000 7.000 7.000 52 52.000 52.000 53.000 1 3.000 4.000 3.000 644.000 716.000 532.000 609.000 30 30.000 32.000 35.000 1262.000 1276.000 1110.000 1215.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 58 69.000 45.000 81.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 24.064 24.064 24.064 24.064 24.064 24.064 16.380 8.697 19.454 14.844 10.233 5.623 15.967 13.654 10.680 6.855 52.759 47.826 39.603 25.898 0.970 0.970 0.970 0.970 0.970 0.970 0.647 0.324 0.776 0.582 0.389 0.195 0.621 0.521 0.393 0.228 2.112 1.904 1.556 0.977 0.339 0.339 0.339 0.339 0.339 0.339 0.311 0.284 0.322 0.306 0.289 0.273 0.348 0.350 0.354 0.358 0.828 0.815 0.793 0.757 15.697 15.697 15.697 15.697 15.697 15.697 10.536 5.374 12.600 9.503 6.407 3.310 10.202 8.631 6.613 4.017 34.261 30.940 25.405 16.180 0.153 0.153 0.153 0.153 0.153 0.153 0.102 0.051 0.122 0.092 0.061 0.031 0.097 0.081 0.060 0.034 0.333 0.300 0.245 0.153 4.095 4.095 4.095 4.095 4.095 4.095 2.739 1.382 3.281 2.467 1.654 0.840 2.638 2.221 1.686 0.998 8.926 8.051 6.594 4.165 1.260 1.260 1.260 1.260 1.260 1.260 0.845 0.430 1.011 0.762 0.513 0.264 0.817 0.691 0.528 0.319 2.749 2.482 2.037 1.295 1.551 1.551 1.551 1.551 1.551 1.551 1.202 0.852 1.341 1.132 0.922 0.713 1.246 1.159 1.047 0.903 3.553 3.336 2.976 2.374 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 46.925 46.925 46.925 46.925 51.279 55.633 49.538 52.150 54.762 57.375 1101.000 1101.000 1101.000 1101.000 1101.000 1101.000 1136.000 1171.000 1122.000 1143.000 1164.000 1185.000 26.500 25.500 27.500 32.500 8.000 14.500 13.000 9.000 
17.725 20.000 20.550 19.240 1.925 0.000 0.000 1.280 1.650 2.000 2.550 2.520 2.450 0.000 1.100 0.960 0.000 0.000 0.000 0.000 2.450 0.000 1.100 0.960 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
In [348]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_dladd_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_dladd_df.isnull().sum()/homeaudio_dladd_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[348]:
Total Percentage
Sale_lag3 0 0.000
TV_Ad_Stock_lag1 0 0.000
TV_SMA_5_lag1 0 0.000
TV_SMA_5_lag2 0 0.000
TV_SMA_5_lag3 0 0.000

Rescaling the Features of the 3 Dataframes

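We will use standard scaling: every column is centred to zero mean and scaled to unit variance, with the scaler fitted separately on each of the three dataframes.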
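As a rough illustration of what standard scaling does to each column, the sketch below applies the z-score formula by hand with pandas. It assumes the population standard deviation (`ddof=0`), which is what scikit-learn's `StandardScaler` uses by default, and it maps constant columns to zero. The helper name `standard_scale` and the toy values are hypothetical and are not taken from the ElecKart data.

```python
import numpy as np
import pandas as pd


def standard_scale(df):
    """Column-wise z-score: (x - mean) / population std (ddof=0)."""
    mean = df.mean()
    std = df.std(ddof=0)
    # Constant columns have std == 0; divide by 1 instead so they map to 0.
    std = std.replace(0, 1.0)
    return (df - mean) / std


# Hypothetical toy data (illustrative only, not from the notebook's dataframes).
toy_df = pd.DataFrame({
    "gmv": [10.0, 20.0, 30.0, 40.0],
    "TV": [1.0, 1.0, 3.0, 3.0],
    "Radio": [0.0, 0.0, 0.0, 0.0],  # constant column, e.g. a channel with no spend
})

scaled = standard_scale(toy_df)
print(scaled.round(3))
```

After scaling, every non-constant column has mean 0 and standard deviation 1, so coefficients fitted on the scaled features can be compared on a common scale. Note that the target column is scaled too in this sketch, mirroring the cell below, which transforms all columns of each dataframe.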

In [349]:
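from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# Standardise every column (including the target, gmv) to zero mean and unit variance.
# fit_transform refits the scaler on each call, so each dataframe is scaled with its own
# statistics; after this cell, `scaler` holds only the fit for homeaudio_dladd_df.
cameraaccessory_dladd_df[cameraaccessory_dladd_df.columns]=scaler.fit_transform(cameraaccessory_dladd_df[cameraaccessory_dladd_df.columns])
gamingaccessory_dladd_df[gamingaccessory_dladd_df.columns]=scaler.fit_transform(gamingaccessory_dladd_df[gamingaccessory_dladd_df.columns])
homeaudio_dladd_df[homeaudio_dladd_df.columns]=scaler.fit_transform(homeaudio_dladd_df[homeaudio_dladd_df.columns])

# Preview the scaled home audio dataframe.
homeaudio_dladd_df.head()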
Out[349]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 -0.107 -1.832 -1.751 -1.672 -0.920 -5.106 -4.055 -3.452 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 1.882 -4.837 -3.947 -3.374 1.297 -6.330 -4.599 -3.763 -0.132 -1.810 -1.734 -1.657 -0.178 -1.853 -1.781 -1.702 1.670 -1.507 -1.507 -1.450 -0.198 -1.509 -1.487 -1.452 -1.132 -1.266 -1.212 -1.162 0.062 -2.605 -2.450 -2.267 -0.420 -2.055 -1.995 -1.945 -0.206 -1.737 -1.669 -1.604 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -0.233 -2.354 -2.207 -2.062 -1.043 -1.000 -1.000 -0.959 -0.545 -0.545 -0.513 -0.480 -1.229 -1.576 -1.522 -1.472 -1.739 -1.671 -1.609 -1.546 -1.806 -1.727 -1.651 -1.579 -1.814 -2.281 -2.135 -2.006 -1.789 -1.895 -1.807 -1.724 -1.331 -1.382 -1.350 -1.319 -1.512 -1.472 -1.435 -1.402 -1.593 -1.550 -1.512 -1.479 -1.803 -1.844 -1.770 -1.700 -1.633 -1.624 -1.578 -1.536 0.042 -0.744 -0.731 -0.718 -0.831 -0.815 -0.800 -0.788 -0.893 -0.876 -0.862 -0.849 0.061 -1.346 -1.316 -1.288 -0.563 -0.977 -0.958 -0.942 -0.922 -1.182 -1.146 -1.111 -1.326 -1.280 -1.237 -1.196 -1.378 -1.327 -1.279 -1.234 -1.404 -1.801 -1.711 -1.630 -1.357 -1.450 -1.394 -1.342 -0.718 -0.716 -0.715 -0.713 -0.821 -0.819 -0.817 -0.808 -0.946 -0.937 -0.923 -0.904 -1.204 -1.186 -1.164 -1.140 -0.952 -0.946 -0.937 -0.925 -2.079 -2.161 -2.063 -1.975 -2.266 -2.155 -2.057 -1.946 -2.221 -2.097 -1.977 -1.863 -2.172 -2.224 -2.082 -1.954 -2.283 -2.229 -2.103 -1.983 -2.144 -2.275 -2.156 -2.052 -2.346 -2.218 -2.106 -1.988 -2.246 -2.117 -1.995 -1.881 -2.136 -2.250 -2.104 -1.974 -2.313 -2.273 -2.139 -2.015 -0.308 -0.946 -0.929 -0.913 -1.054 -1.034 -1.014 -0.992 -1.129 -1.103 -1.076 -1.049 -0.519 -1.611 -1.555 -1.501 -0.896 -1.213 -1.183 -1.154 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 1.550 -6.118 -4.511 -3.712 -4.560 -3.740 
-3.229 -2.869 -3.235 -2.873 -2.600 -2.383 0.157 -6.178 -4.536 -3.726 -4.601 -3.764 -3.245 -2.881 -3.265 -2.896 -2.618 -2.398 0.776 -2.073 -1.963 -1.861 1.118 -0.281 -0.253 -0.234 0.959 -1.300 -1.251 -1.200 -1.035 -1.075 -1.075 -1.073 0.461 -0.613 -0.585 -0.554 0.430 -0.811 -0.805 -0.795 -0.368 -0.368 -0.368 -0.368 0.324 -0.833 -0.828 -0.818 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 -0.435
26 0.203 -0.092 -1.751 -1.672 -0.594 -0.528 -4.055 -3.452 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 1.437 1.463 -3.947 -3.374 0.560 0.630 -4.599 -3.763 0.179 -0.116 -1.734 -1.657 0.130 -0.163 -1.781 -1.702 1.267 1.655 -1.507 -1.450 0.481 -0.195 -1.487 -1.452 -1.132 -1.092 -1.212 -1.162 0.374 0.075 -2.450 -2.267 0.967 -0.404 -1.995 -1.945 0.079 -0.190 -1.669 -1.604 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -0.021 -0.205 -2.207 -2.062 0.959 -1.000 -1.000 -0.959 -0.545 -0.545 -0.513 -0.480 -1.229 -1.180 -1.522 -1.472 -1.739 -1.671 -1.609 -1.546 -1.806 -1.727 -1.651 -1.579 -1.814 -1.680 -2.135 -2.006 -1.667 -1.699 -1.807 -1.724 -1.331 -1.299 -1.350 -1.319 -1.512 -1.472 -1.435 -1.402 -1.593 -1.550 -1.512 -1.479 -1.803 -1.728 -1.770 -1.700 -1.608 -1.584 -1.578 -1.536 0.042 0.051 -0.731 -0.718 -0.831 -0.815 -0.800 -0.788 -0.893 -0.876 -0.862 -0.849 0.061 0.074 -1.316 -1.288 -0.303 -0.547 -0.958 -0.942 -0.922 -0.888 -1.146 -1.111 -1.326 -1.280 -1.237 -1.196 -1.378 -1.327 -1.279 -1.234 -1.404 -1.322 -1.711 -1.630 -1.265 -1.300 -1.394 -1.342 -0.718 -0.716 -0.715 -0.713 -0.821 -0.819 -0.817 -0.808 -0.946 -0.937 -0.923 -0.904 -1.204 -1.186 -1.164 -1.140 -0.952 -0.946 -0.937 -0.925 -2.079 -1.976 -2.063 -1.975 -2.266 -2.155 -2.057 -1.946 -2.221 -2.097 -1.977 -1.863 -2.172 -2.022 -2.082 -1.954 -2.234 -2.151 -2.103 -1.983 -2.144 -2.020 -2.156 -2.052 -2.346 -2.218 -2.106 -1.988 -2.246 -2.117 -1.995 -1.881 -2.136 -1.984 -2.104 -1.974 -2.248 -2.169 -2.139 -2.015 -0.308 -0.296 -0.929 -0.913 -1.054 -1.034 -1.014 -0.992 -1.129 -1.103 -1.076 -1.049 -0.519 -0.487 -1.555 -1.501 -0.687 -0.870 -1.183 -1.154 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 1.550 0.834 -4.511 -3.712 -4.560 -3.740 -3.229 -2.869 
-3.235 -2.873 -2.600 -2.383 0.157 0.203 -4.536 -3.726 -4.601 -3.764 -3.245 -2.881 -3.265 -2.896 -2.618 -2.398 1.308 0.832 -1.963 -1.861 0.947 1.166 -0.253 -0.234 1.314 1.022 -1.251 -1.200 -1.075 -1.035 -1.075 -1.073 1.755 0.553 -0.585 -0.554 -0.424 0.434 -0.805 -0.795 -0.368 -0.368 -0.368 -0.368 -0.472 0.329 -0.828 -0.818 -0.275 -0.275 -0.275 -0.275 0.870 -0.435 -0.435 -0.435
27 -0.066 0.212 -0.063 -1.672 -0.725 -0.307 -0.351 -3.452 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 1.538 1.135 1.277 -3.374 1.280 0.345 0.563 -3.763 0.059 0.189 -0.089 -1.657 0.078 0.139 -0.137 -1.702 0.059 1.260 1.655 -1.450 0.721 0.482 -0.186 -1.452 -1.310 -1.092 -1.039 -1.162 0.390 0.376 0.098 -2.267 0.529 0.960 -0.382 -1.945 -0.142 0.090 -0.163 -1.604 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -0.835 -0.001 -0.162 -2.062 -1.043 1.000 -1.000 -0.959 -0.545 -0.545 -0.513 -0.480 -1.229 -1.180 -1.135 -1.472 -1.299 -1.671 -1.609 -1.546 -1.806 -1.727 -1.651 -1.579 -1.814 -1.680 -1.562 -2.006 -1.594 -1.582 -1.617 -1.724 -1.331 -1.299 -1.267 -1.319 -1.420 -1.472 -1.435 -1.402 -1.593 -1.550 -1.512 -1.479 -1.803 -1.728 -1.657 -1.700 -1.593 -1.560 -1.539 -1.536 0.042 0.051 0.060 -0.718 0.082 -0.815 -0.800 -0.788 -0.893 -0.876 -0.862 -0.849 0.061 0.074 0.086 -1.288 -0.147 -0.289 -0.532 -0.942 -0.922 -0.888 -0.855 -1.111 -0.994 -1.280 -1.237 -1.196 -1.378 -1.327 -1.279 -1.234 -1.404 -1.322 -1.247 -1.630 -1.210 -1.210 -1.248 -1.342 -0.718 -0.716 -0.715 -0.713 -0.821 -0.819 -0.817 -0.808 -0.946 -0.937 -0.923 -0.904 -1.204 -1.186 -1.164 -1.140 -0.952 -0.946 -0.937 -0.925 -2.079 -1.976 -1.885 -1.975 -2.071 -2.155 -2.057 -1.946 -2.221 -2.097 -1.977 -1.863 -2.172 -2.022 -1.889 -1.954 -2.205 -2.104 -2.028 -1.983 -2.144 -2.020 -1.912 -2.052 -2.083 -2.218 -2.106 -1.988 -2.246 -2.117 -1.995 -1.881 -2.136 -1.984 -1.851 -1.974 -2.208 -2.107 -2.040 -2.015 -0.308 -0.296 -0.284 -0.913 -0.317 -1.034 -1.014 -0.992 -1.129 -1.103 -1.076 -1.049 -0.519 -0.487 -0.456 -1.501 -0.563 -0.665 -0.845 -1.154 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 1.550 0.834 0.730 -3.712 0.743 -3.740 -3.229 -2.869 -3.235 -2.873 
-2.600 -2.383 0.157 0.203 0.254 -3.726 0.255 -3.764 -3.245 -2.881 -3.265 -2.896 -2.618 -2.398 1.149 1.351 0.891 -1.861 1.347 0.992 1.216 -0.234 1.300 1.379 1.088 -1.200 -1.075 -1.075 -1.035 -1.073 1.698 1.923 0.653 -0.554 -0.515 -0.416 0.436 -0.795 -0.368 -0.368 -0.368 -0.368 -0.557 -0.465 0.331 -0.818 -0.275 -0.275 -0.275 -0.275 -0.435 0.870 -0.435 -0.435
28 -0.543 -0.052 0.231 -0.033 -0.757 -0.396 -0.172 -0.241 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 1.687 1.209 1.005 1.179 0.487 0.623 0.352 0.556 -0.503 0.071 0.208 -0.059 -0.548 0.088 0.157 -0.106 -0.747 0.074 1.260 1.643 0.202 0.721 0.485 -0.170 -1.310 -1.266 -1.039 -0.990 -0.455 0.392 0.384 0.138 -0.639 0.529 0.952 -0.363 -0.606 -0.127 0.110 -0.136 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -0.906 -0.785 0.032 -0.092 0.959 -1.000 1.000 -0.959 -0.545 -0.545 -0.513 -0.480 -1.229 -1.180 -1.135 -1.092 -1.299 -1.243 -1.609 -1.546 -1.806 -1.727 -1.651 -1.579 -1.814 -1.680 -1.562 -1.458 -1.550 -1.512 -1.504 -1.540 -1.331 -1.299 -1.267 -1.237 -1.420 -1.382 -1.435 -1.402 -1.593 -1.550 -1.512 -1.479 -1.803 -1.728 -1.657 -1.590 -1.585 -1.546 -1.516 -1.497 0.042 0.051 0.060 0.069 0.082 0.092 -0.800 -0.788 -0.893 -0.876 -0.862 -0.849 0.061 0.074 0.086 0.097 -0.054 -0.134 -0.276 -0.518 -0.922 -0.888 -0.855 -0.824 -0.994 -0.954 -1.237 -1.196 -1.378 -1.327 -1.279 -1.234 -1.404 -1.322 -1.247 -1.179 -1.176 -1.156 -1.159 -1.198 -0.718 -0.716 -0.715 -0.713 -0.821 -0.819 -0.817 -0.808 -0.946 -0.937 -0.923 -0.904 -1.204 -1.186 -1.164 -1.140 -0.952 -0.946 -0.937 -0.925 -2.079 -1.976 -1.885 -1.802 -2.071 -1.968 -2.057 -1.946 -2.221 -2.097 -1.977 -1.863 -2.172 -2.022 -1.889 -1.770 -2.187 -2.076 -1.983 -1.912 -2.144 -2.020 -1.912 -1.817 -2.083 -1.966 -2.106 -1.988 -2.246 -2.117 -1.995 -1.881 -2.136 -1.984 -1.851 -1.731 -2.185 -2.070 -1.980 -1.920 -0.308 -0.296 -0.284 -0.272 -0.317 -0.303 -1.014 -0.992 -1.129 -1.103 -1.076 -1.049 -0.519 -0.487 -0.456 -0.425 -0.488 -0.542 -0.642 -0.820 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 1.550 0.834 0.730 0.700 0.743 0.709 -3.229 -2.869 -3.235 -2.873 -2.600 
-2.383 0.157 0.203 0.254 0.298 0.255 0.299 -3.245 -2.881 -3.265 -2.896 -2.618 -2.398 1.361 1.195 1.400 0.945 1.518 1.397 1.039 1.243 1.473 1.364 1.447 1.140 -1.075 -1.075 -1.075 -1.033 2.395 1.863 2.107 0.704 0.491 -0.507 -0.412 0.441 -0.368 -0.368 -0.368 -0.368 0.382 -0.549 -0.460 0.336 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 0.870 -0.435
29 -1.882 -0.519 -0.024 0.253 -4.223 -0.418 -0.244 -0.086 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 3.768 1.319 1.066 0.941 -4.146 0.317 0.558 0.379 -1.862 -0.479 0.093 0.229 -1.900 -0.527 0.107 0.179 -1.552 -0.716 0.074 1.257 -1.517 0.203 0.721 0.490 -1.310 -1.266 -1.212 -0.990 -2.715 -0.424 0.399 0.408 -2.099 -0.619 0.531 0.945 -1.779 -0.583 -0.101 0.131 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -2.465 -0.853 -0.714 0.095 -1.043 1.000 -1.000 1.043 -0.545 -0.545 -0.513 -0.480 -1.538 -1.180 -1.135 -1.092 -1.411 -1.243 -1.191 -1.546 -1.408 -1.727 -1.651 -1.579 -1.922 -1.680 -1.562 -1.458 -1.679 -1.469 -1.436 -1.430 -1.414 -1.299 -1.267 -1.237 -1.450 -1.382 -1.346 -1.402 -1.514 -1.550 -1.512 -1.479 -1.829 -1.728 -1.657 -1.590 -1.619 -1.537 -1.502 -1.474 -0.434 0.051 0.060 0.069 -0.100 0.092 0.102 -0.788 0.023 -0.876 -0.862 -0.849 -0.130 0.074 0.086 0.097 -0.255 -0.041 -0.122 -0.264 -1.186 -0.888 -0.855 -0.824 -1.092 -0.954 -0.916 -1.196 -1.081 -1.327 -1.279 -1.234 -1.502 -1.322 -1.247 -1.179 -1.292 -1.124 -1.107 -1.112 -0.718 -0.716 -0.715 -0.713 -0.821 -0.819 -0.817 -0.808 -0.946 -0.937 -0.923 -0.904 -1.204 -1.186 -1.164 -1.140 -0.952 -0.946 -0.937 -0.925 -2.256 -1.976 -1.885 -1.802 -2.131 -1.968 -1.876 -1.946 -2.063 -2.097 -1.977 -1.863 -2.215 -2.022 -1.889 -1.770 -2.252 -2.059 -1.956 -1.869 -2.381 -2.020 -1.912 -1.817 -2.161 -1.966 -1.864 -1.988 -2.036 -2.117 -1.995 -1.881 -2.191 -1.984 -1.851 -1.731 -2.268 -2.047 -1.944 -1.863 -0.701 -0.296 -0.284 -0.272 -0.464 -0.303 -0.289 -0.992 -0.404 -1.103 -1.076 -1.049 -0.672 -0.487 -0.456 -0.425 -0.651 -0.468 -0.521 -0.620 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 3.068 0.834 0.730 0.700 0.918 0.709 0.703 -2.869 0.780 -2.873 -2.600 
-2.383 0.520 0.203 0.254 0.298 0.295 0.299 0.340 -2.881 0.356 -2.896 -2.618 -2.398 0.830 1.403 1.247 1.445 1.404 1.571 1.451 1.066 1.137 1.539 1.432 1.499 -1.075 -1.075 -1.075 -1.073 1.046 2.600 2.043 2.183 -0.721 0.495 -0.502 -0.403 -0.368 -0.368 -0.368 -0.368 -0.750 0.386 -0.544 -0.452 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 0.870

Splitting the 3 Dataframes into Training and Testing Sets

The first step in building a regression model is to split the data into training and test sets.

In [350]:
from sklearn.model_selection import train_test_split

# Fix random_state so that the train and test sets contain the same rows on every run
cameraaccessory_dladd_train, cameraaccessory_dladd_test = train_test_split(
    cameraaccessory_dladd_df, train_size=0.7, test_size=0.3, random_state=100)

gamingaccessory_dladd_train, gamingaccessory_dladd_test = train_test_split(
    gamingaccessory_dladd_df, train_size=0.7, test_size=0.3, random_state=100)

homeaudio_dladd_train, homeaudio_dladd_test = train_test_split(
    homeaudio_dladd_df, train_size=0.7, test_size=0.3, random_state=100)
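
As a quick check of what `random_state` buys us, the sketch below splits a small toy dataframe twice with the same seed and compares the resulting row indices. The frame `toy_df` and its values are hypothetical stand-ins, not part of the ElecKart data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy frame standing in for one of the weekly dataframes
toy_df = pd.DataFrame({"gmv": range(10), "TV": range(10, 20)})

# Same random_state -> the same rows land in train and test on every call
train_a, test_a = train_test_split(toy_df, train_size=0.7, test_size=0.3, random_state=100)
train_b, test_b = train_test_split(toy_df, train_size=0.7, test_size=0.3, random_state=100)

same_rows = train_a.index.equals(train_b.index) and test_a.index.equals(test_b.index)
print(same_rows)
print(len(train_a), len(test_a))
```

Changing or omitting `random_state` would generally give a different shuffle, so model results would not be comparable from one run to the next.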

Dividing the 3 training dataframes into X and y sets for model building

In [351]:
# pop() removes the target column 'gmv' from the dataframe in place and returns it,
# so the remaining columns form the feature matrix X
y_cameraaccessory_dladd_train = cameraaccessory_dladd_train.pop('gmv')
X_cameraaccessory_dladd_train = cameraaccessory_dladd_train

y_gamingaccessory_dladd_train = gamingaccessory_dladd_train.pop('gmv')
X_gamingaccessory_dladd_train = gamingaccessory_dladd_train

y_homeaudio_dladd_train = homeaudio_dladd_train.pop('gmv')
X_homeaudio_dladd_train = homeaudio_dladd_train

X_homeaudio_dladd_train.head()
Out[351]:
gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
12 -0.417 -0.390 0.031 -0.416 0.000 0.021 0.241 1.839 0.544 -0.277 -0.484 1.854 0.561 -0.248 -0.486 0.152 -0.176 0.063 -0.061 0.222 0.164 0.077 0.111 -1.768 -0.879 -0.329 0.193 -0.455 -0.390 -0.136 0.007 0.461 -0.321 1.260 -1.063 -0.318 -0.673 -0.619 -0.286 0.999 0.480 -0.346 0.897 -0.180 -0.164 0.226 0.129 0.675 -0.619 0.601 -0.363 -0.598 -0.431 -0.350 0.077 -0.146 -0.146 -0.146 -0.146 6.119 -0.197 -0.197 -0.197 0.223 0.239 -0.143 0.710 1.148 1.125 0.584 0.595 -1.043 1.000 -1.000 1.043 -0.545 1.834 1.949 2.082 -0.283 0.748 0.752 0.757 0.470 0.842 0.844 0.357 0.708 0.615 0.305 0.015 0.394 0.606 0.440 0.244 0.331 0.755 0.617 0.400 0.628 2.218 2.200 2.184 1.853 2.406 2.382 1.386 2.252 1.947 1.295 0.664 1.697 1.873 1.486 1.022 1.682 2.252 1.949 1.478 -0.482 -0.084 -0.075 -0.065 -0.222 -0.063 -0.052 -0.111 -0.131 -0.066 -0.102 -0.140 -0.437 -0.291 -0.309 -0.338 -0.329 -0.120 -0.131 -0.158 -0.242 0.469 0.483 0.497 0.280 0.552 0.566 0.121 0.475 0.349 0.066 -0.208 0.067 0.193 0.028 -0.180 0.152 0.440 0.320 0.115 -0.678 -0.306 -0.305 -0.304 -0.493 -0.349 -0.348 -0.297 -0.500 -0.365 -0.325 -0.280 -0.515 -0.327 -0.269 -0.196 -0.587 -0.374 -0.350 -0.312 0.125 0.412 0.420 0.428 0.344 0.444 0.451 0.356 0.410 0.405 0.356 0.316 0.465 0.534 0.536 0.540 0.333 0.445 0.431 0.408 0.358 0.612 0.612 0.614 0.556 0.639 0.639 0.498 0.607 0.565 0.486 0.419 0.631 0.672 0.653 0.635 0.540 0.622 0.591 0.545 -0.413 -0.274 -0.262 -0.250 -0.340 -0.278 -0.265 -0.311 -0.310 -0.303 -0.326 -0.348 -0.566 -0.494 -0.476 -0.461 -0.430 -0.355 -0.354 -0.363 -0.476 0.771 0.771 0.771 0.435 0.945 0.945 0.444 0.651 0.655 0.337 0.029 0.566 0.975 0.887 0.778 0.256 0.887 0.772 0.584 -0.456 1.658 1.658 1.658 1.162 2.021 2.021 1.164 1.617 1.617 1.054 0.496 1.283 1.880 1.620 1.291 0.913 1.958 1.733 1.359 0.769 0.124 0.194 0.249 0.292 0.255 0.301 0.370 0.340 0.358 0.410 0.457 0.970 -0.675 -0.405 -0.256 -0.101 -0.260 -0.154 0.121 -0.019 0.042 0.210 0.359 -0.978 -0.413 0.126 -0.258 -1.278 -0.859 -0.665 -0.529 -1.209 
-0.506 -0.585 -0.164 1.287 0.494 0.656 0.209 -0.641 -0.613 -0.585 -0.554 -0.155 -0.014 2.793 -0.655 0.330 0.321 -0.368 -0.368 -0.076 0.054 2.531 -0.688 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 -0.435
10 -0.023 0.753 0.107 -0.239 0.158 0.668 0.052 -0.347 -0.550 -0.560 -0.540 -0.317 -0.551 -0.561 -0.540 -0.099 -0.252 -0.180 0.207 -0.403 -0.088 0.237 0.288 -0.385 0.150 0.718 0.281 -0.176 -0.043 0.619 0.176 1.267 -1.112 -0.716 0.483 -0.638 -0.314 0.130 0.335 -0.422 0.830 0.866 -0.304 0.202 0.065 0.468 0.394 0.602 -0.404 1.092 1.564 -0.402 0.034 0.837 0.093 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.159 0.620 0.239 -0.493 0.582 0.545 0.746 0.502 -1.043 1.000 -1.000 -0.959 1.834 1.834 -0.513 2.082 0.744 0.748 0.752 -0.615 0.842 0.326 -0.163 -0.626 0.268 -0.043 -0.334 -0.608 0.409 0.189 -0.071 -0.385 0.604 0.368 -0.006 -0.608 2.237 2.218 2.200 -0.540 2.432 1.401 0.404 -0.565 1.312 0.659 0.028 -0.586 1.529 1.034 0.437 -0.290 1.995 1.502 0.722 -0.531 -0.094 -0.084 -0.075 -0.242 -0.074 -0.131 -0.187 -0.245 -0.124 -0.159 -0.195 -0.231 -0.345 -0.372 -0.411 -0.467 -0.156 -0.181 -0.232 -0.326 0.454 0.469 0.483 -0.748 0.538 0.079 -0.365 -0.796 0.017 -0.265 -0.538 -0.802 -0.040 -0.260 -0.536 -0.884 0.280 0.066 -0.291 -0.883 -0.307 -0.306 -0.305 -0.189 -0.351 -0.306 -0.260 -0.210 -0.341 -0.305 -0.264 -0.221 -0.292 -0.223 -0.137 -0.028 -0.361 -0.327 -0.272 -0.183 0.404 0.412 0.420 0.099 0.438 0.326 0.222 0.135 0.319 0.265 0.221 0.186 0.510 0.505 0.500 0.496 0.409 0.377 0.324 0.236 0.612 0.612 0.612 0.167 0.640 0.479 0.332 0.203 0.461 0.378 0.308 0.247 0.640 0.610 0.578 0.542 0.583 0.526 0.438 0.300 -0.286 -0.274 -0.262 -0.415 -0.292 -0.341 -0.389 -0.434 -0.363 -0.387 -0.409 -0.430 -0.540 -0.524 -0.513 -0.508 -0.391 -0.402 -0.431 -0.491 0.771 0.771 0.771 -0.476 0.945 0.435 -0.074 -0.571 0.319 -0.006 -0.321 -0.628 0.869 0.749 0.598 0.408 0.765 0.569 0.244 -0.292 1.658 1.658 1.658 -0.456 2.021 1.162 0.303 -0.550 1.050 0.485 -0.075 -0.628 1.620 1.284 0.858 0.316 1.733 1.357 0.732 -0.305 -0.022 0.124 0.194 0.354 0.201 0.290 0.364 0.427 0.337 0.392 0.441 0.487 -1.866 -0.675 -0.405 0.452 -0.413 -0.022 0.267 0.503 0.096 0.272 0.423 0.559 -0.021 -0.413 0.075 -0.809 -0.707 
-0.570 -0.488 -0.825 -0.695 -0.272 -0.068 -0.944 0.656 0.208 0.029 1.153 -0.641 -0.613 -0.585 -0.554 2.802 -0.670 1.725 -0.515 -0.368 -0.368 -0.368 2.593 2.538 -0.702 1.534 0.053 -0.275 -0.275 -0.275 1.394 -0.435 -0.435 1.523 -0.435
32 -0.238 -0.321 -1.672 -0.106 0.096 0.182 -1.805 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 0.583 -0.069 0.022 2.186 0.241 0.711 0.431 -0.746 -0.268 -0.039 -0.251 -1.657 -0.215 0.013 -0.185 -1.702 1.267 1.260 0.469 -1.450 0.561 0.602 0.406 -1.452 -0.600 -0.742 -1.039 -1.162 0.148 0.558 0.142 -2.267 0.237 0.458 0.531 -1.945 -0.508 -0.369 -0.454 -1.603 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -0.871 0.409 -0.065 -2.062 0.959 -1.000 1.000 -0.959 -0.545 -0.545 -0.513 -0.480 0.654 0.660 -1.430 -1.382 -0.049 -0.797 -1.510 -1.241 -0.680 -1.126 -1.546 -1.214 -0.858 -1.314 -1.875 -1.551 -0.110 -0.707 -1.671 -1.441 0.108 0.120 -1.348 -1.317 -0.414 -0.932 -1.433 -1.344 -0.888 -1.203 -1.511 -1.404 -1.051 -1.353 -1.736 -1.614 -0.499 -0.902 -1.570 -1.485 -0.329 -0.318 -0.411 -0.400 -0.382 -0.409 -0.435 -0.068 -0.418 -0.432 -0.447 0.050 -0.575 -0.543 -0.511 -0.087 -0.453 -0.462 -0.488 -0.217 1.308 1.310 -1.112 -1.078 0.563 -0.333 -1.200 -0.973 -0.132 -0.696 -1.239 -0.953 -0.130 -0.762 -1.541 -1.268 0.590 -0.142 -1.331 -1.138 -0.042 -0.041 -0.715 -0.713 -0.305 -0.562 -0.817 -0.808 -0.588 -0.759 -0.923 -0.904 -0.745 -0.930 -1.164 -1.140 -0.375 -0.587 -0.937 -0.925 0.106 0.125 -2.049 -1.962 -0.659 -1.376 -2.043 -1.825 -1.257 -1.628 -1.963 -1.724 -1.303 -1.617 -2.016 -1.807 -0.740 -1.249 -2.076 -1.884 0.041 0.065 -2.130 -2.026 -0.723 -1.428 -2.079 -1.825 -1.288 -1.646 -1.969 -1.696 -1.335 -1.632 -2.012 -1.779 -0.797 -1.291 -2.097 -1.881 -0.154 -0.143 -0.671 -0.656 -0.349 -0.538 -0.724 -0.417 -0.525 -0.641 -0.754 -0.347 -0.710 -0.803 -0.927 -0.569 -0.421 -0.565 -0.814 -0.584 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 -0.613 -0.143 1.247 1.135 0.421 0.794 1.090 0.824 0.714 0.888 1.044 0.792 -0.792 
-0.209 0.372 0.397 0.086 0.279 0.428 0.405 0.299 0.399 0.489 0.464 0.723 1.299 1.298 0.995 1.175 0.761 1.803 1.539 1.011 0.923 1.595 1.321 -1.075 -0.895 -1.075 -1.073 0.538 0.620 2.770 1.373 -0.509 -0.540 2.602 -0.697 -0.368 -0.368 -0.368 -0.368 -0.552 -0.581 2.353 -0.727 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 -0.435
22 -0.383 -0.392 -0.232 -1.292 -0.599 -0.127 0.222 1.691 2.105 1.887 1.552 1.694 2.146 1.862 1.573 -0.740 0.035 -0.349 -0.674 0.267 -0.057 0.206 0.102 -0.438 -0.290 -0.372 -0.269 -0.498 -0.386 -0.431 -0.392 -0.344 -1.507 -0.716 0.870 -0.878 -0.912 -0.777 -1.219 0.644 1.878 1.559 1.755 -0.186 0.226 -0.569 -1.200 -1.515 -0.691 -0.311 -0.225 -0.600 -0.579 -0.462 -0.106 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 2.137 2.148 0.239 1.111 1.007 1.943 -0.195 -0.780 0.959 -1.000 1.000 -0.959 1.834 -0.545 -0.513 2.082 -0.616 -0.581 -0.103 -0.081 -0.465 -0.266 -0.078 -0.051 -0.304 -0.164 -0.031 -0.029 -0.389 -0.205 -0.005 0.042 -0.510 -0.331 -0.077 -0.048 -0.763 -0.738 -0.931 -0.906 -0.886 -0.936 -0.985 -0.959 -0.993 -1.012 -1.032 -0.651 -0.739 -0.623 -0.493 -0.261 -0.925 -0.915 -0.922 -0.787 -0.309 -0.298 -0.538 -0.527 -0.416 -0.499 -0.581 -0.569 -0.508 -0.559 -0.612 -0.580 -0.787 -0.823 -0.876 -0.826 -0.508 -0.571 -0.686 -0.662 -0.212 -0.188 -0.153 -0.131 -0.199 -0.168 -0.139 -0.114 -0.152 -0.121 -0.092 -0.076 -0.196 -0.148 -0.101 -0.059 -0.209 -0.174 -0.138 -0.108 -0.669 -0.668 -0.018 -0.017 -0.518 -0.269 -0.020 -0.014 -0.366 -0.188 -0.009 -0.168 -0.590 -0.426 -0.221 -0.272 -0.599 -0.401 -0.075 -0.109 -1.115 -1.049 0.485 0.491 -0.562 -0.002 0.517 0.529 -0.102 0.238 0.556 0.512 -0.095 0.221 0.586 0.582 -0.539 -0.105 0.536 0.532 -1.064 -0.990 0.267 0.282 -0.589 -0.127 0.296 0.317 -0.205 0.073 0.333 0.378 -0.137 0.141 0.459 0.494 -0.558 -0.193 0.336 0.363 -0.476 -0.462 -0.219 -0.207 -0.418 -0.316 -0.216 -0.202 -0.341 -0.265 -0.190 -0.213 -0.606 -0.514 -0.409 -0.398 -0.492 -0.403 -0.269 -0.261 -0.476 -0.476 0.782 0.782 -0.070 0.445 0.959 0.966 0.328 0.668 1.011 0.697 0.214 0.526 0.923 0.818 -0.094 0.304 0.964 0.892 -0.456 -0.456 0.073 0.073 -0.341 -0.126 0.089 0.093 -0.226 -0.081 0.066 -0.066 -0.253 -0.086 0.128 0.145 -0.402 -0.230 0.056 0.041 0.399 0.314 0.030 0.110 0.243 0.202 0.177 0.229 0.269 0.270 0.274 0.366 0.370 0.295 0.461 0.473 0.372 0.436 0.495 0.520 
0.453 0.500 0.545 0.580 1.308 1.559 1.094 1.245 0.662 0.761 1.392 0.889 1.097 1.092 1.243 1.496 -1.015 -0.995 -1.075 -1.073 1.081 0.985 1.198 2.165 -0.509 -0.811 0.376 -0.693 -0.368 -0.368 -0.368 -0.368 -0.552 -0.833 0.274 -0.722 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 1.523
45 0.818 -0.107 -0.275 0.693 1.601 0.565 -0.033 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 0.210 -0.555 0.162 0.602 0.979 0.900 0.452 0.061 0.194 1.029 0.051 -0.099 -0.282 0.522 -0.066 -0.198 0.461 -0.321 0.074 -0.290 0.122 0.163 0.643 -0.092 0.289 -0.568 -0.173 -0.647 -0.218 0.122 0.142 -0.151 0.164 -0.188 -0.522 -0.363 0.107 1.029 -0.101 -0.203 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 -0.410 -0.103 -0.260 -0.749 0.959 -1.000 1.000 -0.959 -0.545 -0.545 -0.513 -0.480 0.904 0.904 -0.591 -0.559 0.449 -0.091 -0.604 -0.565 0.045 -0.282 -0.588 0.109 0.624 0.384 0.103 0.371 0.574 0.192 -0.415 -0.210 0.703 0.706 -0.061 -0.047 0.491 0.216 -0.049 -0.036 0.304 0.133 -0.033 0.219 0.527 0.384 0.207 0.284 0.560 0.362 0.038 0.107 0.210 0.218 -0.412 -0.401 0.028 -0.206 -0.437 -0.426 -0.138 -0.293 -0.448 0.498 0.394 0.411 0.430 0.769 0.121 0.004 -0.197 0.035 1.063 1.069 -0.702 -0.673 0.538 -0.115 -0.746 -0.712 0.055 -0.354 -0.748 -0.036 0.718 0.403 0.019 0.334 0.676 0.203 -0.566 -0.355 0.461 0.461 -0.565 -0.564 0.136 -0.255 -0.646 -0.637 -0.202 -0.468 -0.726 0.241 0.468 0.372 0.251 0.589 0.247 0.002 -0.405 -0.175 0.994 0.979 0.044 0.064 0.699 0.374 0.072 0.101 0.459 0.278 0.118 0.377 0.659 0.509 0.337 0.406 0.754 0.527 0.181 0.259 0.910 0.896 0.194 0.211 0.685 0.444 0.224 0.247 0.507 0.377 0.263 0.430 0.631 0.514 0.379 0.416 0.724 0.555 0.299 0.351 0.494 0.501 -0.393 -0.380 0.244 -0.087 -0.411 -0.395 0.015 -0.199 -0.407 0.494 0.626 0.558 0.473 0.795 0.369 0.170 -0.162 0.065 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 -0.930 -0.286 0.001 0.085 -0.073 0.057 0.155 0.209 0.119 0.192 0.255 0.268 -1.579 -0.550 0.482 0.490 -0.050 0.268 0.510 0.533 0.271 0.424 0.558 0.575 -1.616 -0.413 -1.198 -0.659 
-0.878 -0.570 -0.724 0.180 -1.364 -0.315 -1.042 -0.292 1.477 0.261 1.209 0.364 -0.641 -0.613 -0.585 -0.554 0.170 -0.769 0.319 2.452 -0.368 -0.368 -0.368 -0.368 0.082 -0.794 0.222 2.213 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 -0.435
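
The X/y division above relies on `DataFrame.pop`, which mutates the dataframe it is called on. The following minimal sketch, using a hypothetical three-row frame (`toy_train` and its values are illustrative only), shows that after `pop('gmv')` the original frame and the feature matrix are the same object and no longer contain the target.

```python
import pandas as pd

# Hypothetical toy training frame; values are illustrative only
toy_train = pd.DataFrame({"gmv": [1.0, 2.0, 3.0],
                          "TV": [0.1, 0.2, 0.3],
                          "Sale": [0, 1, 0]})

y_toy = toy_train.pop("gmv")  # removes 'gmv' in place and returns it as a Series
X_toy = toy_train             # same object, now without the target column

print(list(X_toy.columns))
print(X_toy is toy_train)
```

Because of this in-place behaviour, re-running the cell without re-running the split would raise a `KeyError`, since `gmv` is already gone. A non-mutating alternative, if that matters, would be `toy_train.drop(columns='gmv')` together with `toy_train['gmv']`.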

Dividing the 3 test dataframes into X and y sets for model building

In [352]:
# As with the training sets, pop() removes 'gmv' in place and returns it as the target
y_cameraaccessory_dladd_test = cameraaccessory_dladd_test.pop('gmv')
X_cameraaccessory_dladd_test = cameraaccessory_dladd_test

y_gamingaccessory_dladd_test = gamingaccessory_dladd_test.pop('gmv')
X_gamingaccessory_dladd_test = gamingaccessory_dladd_test

y_homeaudio_dladd_test = homeaudio_dladd_test.pop('gmv')
X_homeaudio_dladd_test = homeaudio_dladd_test

X_homeaudio_dladd_test.head()
Out[352]:
gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
31 -0.358 -1.750 -0.436 0.004 0.130 -2.155 -0.164 -0.639 -0.607 -0.576 -0.544 -0.639 -0.608 -0.577 -0.545 -0.191 -0.050 2.433 1.075 1.508 0.451 -0.993 0.362 -0.054 -0.283 -1.734 -0.401 0.001 -0.212 -1.781 -0.449 1.267 0.469 -1.507 -0.677 0.601 0.402 -1.487 0.219 -0.777 -1.092 -1.212 -1.162 0.562 0.122 -2.450 -0.310 0.456 0.529 -1.995 -0.569 -0.388 -0.490 -1.667 -0.509 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.525 -0.493 0.404 -0.103 -2.207 -0.687 -1.043 1.000 -1.000 1.043 -0.545 -0.545 -0.513 -0.480 0.654 -1.482 -1.430 -1.092 -0.842 -1.570 -1.297 -1.137 -1.187 -1.619 -1.276 -1.579 -1.428 -2.008 -1.659 -1.458 -0.761 -1.755 -1.515 -1.325 0.108 -1.381 -1.348 -1.237 -0.962 -1.471 -1.375 -1.315 -1.239 -1.548 -1.436 -1.479 -1.417 -1.809 -1.682 -1.590 -0.936 -1.616 -1.526 -1.452 -0.329 -0.422 -0.411 0.069 -0.422 -0.449 -0.077 0.110 -0.445 -0.459 0.043 -0.849 -0.565 -0.531 -0.100 0.097 -0.478 -0.503 -0.229 -0.020 1.308 -1.149 -1.112 -0.824 -0.362 -1.243 -1.010 -0.879 -0.734 -1.286 -0.993 -1.234 -0.825 -1.625 -1.338 -1.179 -0.172 -1.385 -1.186 -1.029 -0.042 -0.716 -0.715 -0.713 -0.563 -0.819 -0.817 -0.808 -0.767 -0.937 -0.923 -0.904 -0.946 -1.186 -1.164 -1.140 -0.592 -0.946 -0.937 -0.925 0.106 -2.147 -2.049 -1.802 -1.455 -2.141 -1.932 -1.772 -1.731 -2.083 -1.832 -1.863 -1.745 -2.155 -1.929 -1.770 -1.339 -2.201 -1.999 -1.827 0.041 -2.247 -2.130 -1.817 -1.520 -2.191 -1.936 -1.756 -1.753 -2.090 -1.802 -1.881 -1.764 -2.154 -1.901 -1.731 -1.389 -2.230 -1.999 -1.808 -0.154 -0.686 -0.671 -0.272 -0.554 -0.741 -0.434 -0.274 -0.662 -0.777 -0.367 -1.049 -0.842 -0.969 -0.602 -0.425 -0.587 -0.839 -0.606 -0.427 -0.476 -0.476 -0.476 -0.476 -0.584 -0.584 -0.584 -0.571 -0.677 -0.667 -0.650 -0.628 -0.845 -0.826 -0.804 -0.777 -0.686 -0.680 -0.670 -0.655 -0.456 -0.456 -0.456 -0.456 -0.555 -0.555 -0.555 -0.550 -0.652 -0.647 -0.640 -0.628 -0.814 -0.804 -0.790 -0.774 -0.657 -0.654 -0.650 -0.642 -0.613 1.520 1.247 0.700 0.844 1.149 0.832 0.706 0.902 1.060 0.782 
-2.383 -0.792 0.360 0.372 0.298 0.230 0.399 0.369 0.378 0.363 0.458 0.429 -2.398 1.255 1.247 0.942 1.496 0.719 1.744 1.509 1.657 0.860 1.526 1.268 1.661 -0.895 -1.075 -1.075 -1.073 0.524 2.548 1.310 2.913 -0.549 2.605 -0.707 0.502 -0.368 -0.368 -0.368 -0.368 -0.589 2.355 -0.736 0.393 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 -0.435
5 0.003 0.842 0.054 1.370 0.483 0.952 0.400 -0.636 -0.604 -0.575 -0.542 -0.636 -0.605 -0.576 -0.540 -0.740 -0.246 -0.459 0.145 -0.306 0.492 0.465 0.287 0.910 0.246 1.122 0.203 0.520 -0.152 0.462 -0.180 -1.149 -0.716 0.074 -0.290 0.561 0.243 -0.383 -0.442 -0.600 0.131 -0.520 0.897 0.100 0.085 0.251 0.007 -0.566 -0.260 -0.522 0.188 0.971 0.091 1.109 0.075 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 0.620 -0.525 -0.493 0.546 -0.035 0.519 0.314 0.959 1.000 -1.000 1.043 1.834 1.834 -0.513 -0.480 -0.721 -0.683 -0.648 0.181 -0.747 -0.406 -0.082 0.230 -0.355 -0.128 0.089 0.460 -0.276 -0.024 0.260 0.588 -0.599 -0.390 -0.091 0.360 -0.604 -0.582 -0.560 0.331 -0.635 -0.288 0.047 0.370 -0.236 -0.012 0.204 0.501 -0.133 0.071 0.315 0.608 -0.452 -0.267 0.016 0.460 -0.273 -0.263 -0.252 -0.577 -0.279 -0.395 -0.510 -0.626 -0.441 -0.515 -0.590 -0.455 -0.538 -0.524 -0.515 -0.510 -0.411 -0.443 -0.505 -0.618 -0.843 -0.810 -0.778 -0.948 -0.906 -0.943 -0.980 -1.017 -1.022 -1.028 -1.034 -0.557 -0.953 -0.798 -0.630 -0.443 -1.015 -0.959 -0.902 -0.841 -0.193 -0.191 -0.190 0.278 -0.219 -0.040 0.140 0.323 -0.002 0.126 0.256 0.436 0.096 0.216 0.367 0.558 -0.113 -0.018 0.140 0.399 0.038 0.059 0.080 1.009 0.068 0.417 0.742 1.049 0.506 0.709 0.901 1.076 0.623 0.752 0.908 1.099 0.314 0.473 0.714 1.087 0.105 0.127 0.147 1.108 0.138 0.495 0.825 1.134 0.579 0.780 0.970 1.107 0.668 0.789 0.934 1.113 0.384 0.540 0.775 1.141 -0.455 -0.441 -0.428 -0.377 -0.482 -0.452 -0.422 -0.391 -0.470 -0.442 -0.412 -0.161 -0.496 -0.387 -0.260 -0.110 -0.528 -0.480 -0.414 -0.320 -0.476 -0.476 -0.476 3.386 -0.584 0.994 2.572 4.146 1.378 2.404 3.426 3.437 1.153 1.725 2.454 3.387 0.333 1.014 2.143 4.012 -0.456 -0.456 -0.456 3.136 -0.555 0.904 2.363 3.819 1.276 2.239 3.197 3.194 1.017 1.540 2.208 3.058 0.287 0.917 1.964 3.702 0.346 0.290 0.320 0.093 0.328 0.273 0.240 0.215 0.301 0.299 0.300 0.288 0.720 0.447 0.437 -0.129 0.441 0.259 0.132 0.032 0.268 0.218 0.180 0.141 -0.340 -1.554 -0.638 -1.310 -1.677 
-2.364 -0.900 -1.297 -0.759 -2.140 -0.879 -1.410 0.734 2.488 1.012 1.716 -0.641 -0.613 -0.585 -0.554 -0.820 -0.811 0.151 -0.795 -0.368 3.016 -0.368 -0.368 -0.842 -0.132 0.064 -0.818 0.560 0.977 -0.275 -0.275 1.523 0.218 0.870 -0.435
9 0.750 0.081 0.268 0.096 0.732 -0.013 0.288 -0.582 -0.591 -0.573 -0.540 -0.583 -0.592 -0.572 -0.541 -0.439 -0.294 0.163 0.181 -0.563 0.190 0.243 0.162 0.139 0.714 0.261 0.436 -0.055 0.615 0.153 0.167 -1.149 -0.716 0.469 0.097 -0.318 0.124 0.327 -0.170 0.822 0.830 -0.346 0.554 0.051 0.465 0.369 0.045 -0.420 1.104 1.583 0.876 0.022 0.836 0.072 0.343 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 0.606 0.239 -0.525 -0.493 0.546 0.750 0.454 0.658 0.959 -1.000 -1.000 -0.959 1.834 -0.545 1.949 -0.480 0.744 0.748 -0.648 -0.615 0.312 -0.190 -0.666 -0.626 -0.071 -0.372 -0.654 -0.427 0.160 -0.113 -0.441 -0.274 0.353 -0.035 -0.656 -0.554 2.237 2.218 -0.560 -0.540 1.410 0.396 -0.586 -0.565 0.656 0.015 -0.606 -0.386 1.043 0.427 -0.322 -0.188 1.517 0.721 -0.555 -0.475 -0.094 -0.084 -0.252 -0.242 -0.142 -0.199 -0.254 -0.245 -0.171 -0.205 -0.241 -0.318 -0.391 -0.430 -0.485 -0.472 -0.195 -0.245 -0.338 -0.342 0.454 0.469 -0.778 -0.748 0.057 -0.394 -0.831 -0.796 -0.296 -0.573 -0.840 -0.850 -0.304 -0.587 -0.943 -0.834 0.041 -0.323 -0.927 -0.881 -0.307 -0.306 -0.190 -0.189 -0.307 -0.262 -0.217 -0.210 -0.311 -0.275 -0.234 -0.099 -0.233 -0.148 -0.041 0.038 -0.331 -0.278 -0.192 -0.151 0.404 0.412 0.080 0.099 0.314 0.207 0.108 0.135 0.245 0.194 0.153 0.366 0.493 0.484 0.475 0.565 0.365 0.307 0.212 0.284 0.612 0.612 0.147 0.167 0.473 0.319 0.178 0.203 0.364 0.285 0.218 0.428 0.604 0.566 0.523 0.607 0.520 0.425 0.278 0.347 -0.286 -0.274 -0.428 -0.415 -0.356 -0.404 -0.451 -0.434 -0.405 -0.429 -0.451 -0.420 -0.557 -0.545 -0.540 -0.462 -0.421 -0.451 -0.513 -0.482 0.771 0.771 -0.476 -0.476 0.435 -0.074 -0.584 -0.571 -0.013 -0.337 -0.650 0.388 0.741 0.585 0.389 0.747 0.566 0.238 -0.306 -0.050 1.658 1.658 -0.456 -0.456 1.162 0.303 -0.555 -0.550 0.483 -0.081 -0.640 0.327 1.283 0.853 0.307 0.628 1.356 0.730 -0.311 -0.079 -0.022 0.124 0.320 0.354 0.244 0.326 0.395 0.427 0.356 0.409 0.457 0.450 -1.866 -0.675 0.437 0.452 -0.129 0.216 0.477 0.503 0.223 0.387 0.530 0.477 -0.500 0.002 -0.893 
-1.761 -0.593 -0.512 -0.841 -0.647 -0.329 -0.126 -0.995 -1.276 0.208 0.029 1.153 1.554 -0.641 -0.613 -0.585 -0.554 -0.679 1.726 -0.524 2.760 -0.368 -0.368 2.593 3.439 -0.710 1.534 0.047 3.285 -0.275 -0.275 1.394 1.603 -0.435 1.523 -0.435 -0.435
3 0.001 0.081 0.871 1.464 0.386 0.538 1.099 -0.638 -0.605 -0.575 -0.544 -0.638 -0.603 -0.576 -0.545 -0.951 0.033 -0.051 -0.388 0.955 0.196 0.546 0.562 1.140 0.161 0.211 1.250 0.450 -0.241 0.120 1.290 0.059 -0.321 -0.321 0.483 -0.398 -0.474 0.406 -0.015 -0.600 0.830 0.173 -0.647 0.229 -0.070 -0.480 -0.202 -0.566 0.170 -0.803 -0.294 1.123 0.031 0.201 1.302 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 -0.542 -0.525 -0.143 -0.493 0.511 0.238 -0.097 0.376 -1.043 1.000 1.000 -0.959 -0.545 -0.545 -0.513 -0.480 -0.721 0.147 0.165 0.181 -0.132 0.193 0.476 0.749 0.038 0.428 0.781 0.954 0.208 0.566 0.719 0.824 -0.154 0.326 0.503 0.679 -0.604 0.313 0.322 0.331 0.019 0.354 0.502 0.645 0.183 0.491 0.674 0.764 0.288 0.596 0.654 0.697 -0.015 0.447 0.535 0.624 -0.273 -0.601 -0.588 -0.577 -0.537 -0.652 -0.328 -0.010 -0.617 -0.478 -0.042 0.176 -0.557 -0.549 -0.179 0.085 -0.536 -0.649 -0.395 -0.137 -0.843 -1.016 -0.981 -0.948 -1.060 -1.095 -0.296 0.478 -1.123 -0.626 0.393 0.895 -0.745 -0.539 0.137 0.605 -0.996 -0.928 -0.344 0.240 -0.193 0.276 0.277 0.278 0.138 0.317 0.389 0.464 0.245 0.421 0.524 0.581 0.355 0.546 0.576 0.599 0.132 0.390 0.420 0.451 0.038 1.035 1.021 1.009 0.751 1.074 1.041 1.015 0.913 1.088 1.056 1.046 0.922 1.116 1.016 0.954 0.718 1.116 1.059 1.010 0.105 1.146 1.125 1.108 0.842 1.168 1.068 0.981 0.989 1.124 1.019 0.973 0.952 1.134 1.001 0.918 0.787 1.178 1.080 0.993 -0.455 -0.402 -0.390 -0.377 -0.453 -0.423 -0.072 0.274 -0.451 -0.196 0.269 0.504 -0.313 -0.157 0.195 0.444 -0.454 -0.358 -0.094 0.174 -0.476 3.386 3.386 3.386 2.572 4.150 2.572 1.001 3.433 3.428 1.388 0.388 2.458 3.392 1.806 0.682 2.144 4.025 2.777 1.490 -0.456 3.136 3.136 3.136 2.363 3.822 2.363 0.906 3.203 3.201 1.279 0.327 2.213 3.071 1.611 0.569 1.965 3.710 2.553 1.354 0.346 -0.121 0.009 0.093 0.119 0.098 0.131 0.158 0.207 0.197 0.214 0.244 0.720 -0.475 -0.255 -0.129 -0.027 -0.132 -0.055 0.006 0.060 0.022 0.070 0.121 -0.819 -1.502 -1.504 -0.759 -0.935 -1.322 -1.664 -1.888 -0.985 -1.508 
-1.686 -1.193 1.012 1.717 1.989 1.453 -0.641 -0.613 -0.585 -0.554 0.142 -0.811 -0.093 1.407 -0.368 -0.368 -0.368 2.593 0.056 -0.833 -0.163 1.847 -0.275 -0.275 6.194 -0.275 0.870 -0.435 -0.435 -0.435
18 -0.346 -0.330 0.538 -0.363 0.021 0.220 0.290 1.622 1.745 1.654 2.005 1.619 1.727 1.637 2.027 -0.776 -0.402 -0.269 0.337 0.102 0.052 -0.046 0.220 -0.432 -0.214 -0.178 -0.312 -0.463 -0.291 -0.208 0.652 1.267 -1.112 0.469 0.097 -1.477 -1.390 -0.777 1.695 1.354 2.053 1.212 2.956 -0.385 -0.055 -0.154 1.196 -1.077 -0.763 0.039 2.045 -0.590 -0.428 -0.320 0.410 -0.146 -0.146 -0.146 -0.146 -0.197 -0.197 -0.197 -0.197 0.989 4.820 1.002 1.913 1.396 1.295 0.617 3.347 0.959 1.000 -1.000 1.043 -0.545 -0.545 1.949 -0.480 -0.150 -0.126 -0.103 -0.205 -0.128 -0.149 -0.169 -0.185 -0.150 -0.149 -0.147 0.076 -0.061 -0.003 0.054 0.114 -0.143 -0.115 -0.091 -0.071 -0.986 -0.958 -0.931 0.641 -1.047 -0.444 0.139 0.704 -0.351 0.035 0.408 1.123 -0.081 0.286 0.727 1.265 -0.696 -0.365 0.149 0.966 -0.562 -0.550 -0.538 -0.447 -0.609 -0.564 -0.520 -0.479 -0.598 -0.562 -0.529 -0.399 -0.862 -0.789 -0.706 -0.609 -0.696 -0.653 -0.597 -0.515 -0.200 -0.176 -0.153 -0.171 -0.190 -0.179 -0.168 -0.158 -0.166 -0.148 -0.131 0.045 -0.161 -0.110 -0.059 -0.006 -0.193 -0.158 -0.121 -0.080 -0.020 -0.019 -0.018 -0.674 -0.022 -0.272 -0.521 -0.763 -0.371 -0.538 -0.698 -0.756 -0.399 -0.500 -0.627 -0.789 -0.197 -0.309 -0.496 -0.806 0.474 0.479 0.485 0.177 0.509 0.400 0.299 0.214 0.396 0.342 0.298 0.314 0.511 0.485 0.456 0.423 0.477 0.438 0.375 0.275 0.235 0.251 0.267 0.390 0.267 0.321 0.371 0.423 0.341 0.379 0.419 0.504 0.461 0.499 0.544 0.599 0.324 0.360 0.408 0.478 -0.242 -0.231 -0.219 -0.374 -0.243 -0.293 -0.342 -0.389 -0.309 -0.334 -0.358 -0.348 -0.514 -0.513 -0.520 -0.537 -0.333 -0.345 -0.375 -0.435 0.782 0.782 0.782 -0.476 0.959 0.445 -0.070 -0.571 0.328 0.000 -0.318 -0.300 0.623 0.429 0.184 -0.126 0.743 0.521 0.153 -0.454 0.073 0.073 0.073 -0.456 0.089 -0.126 -0.341 -0.550 -0.226 -0.364 -0.498 -0.066 0.128 0.146 0.168 0.197 -0.004 -0.060 -0.152 -0.305 -0.505 -0.094 0.030 0.476 0.035 0.238 0.394 0.525 0.306 0.406 0.493 0.541 0.795 0.480 0.461 0.521 0.465 0.491 0.523 0.558 0.512 0.542 0.574 0.498 0.032 0.573 
0.330 0.393 0.149 0.008 0.158 0.239 0.073 0.185 0.136 0.015 -0.286 -0.338 -0.218 -0.007 -0.641 -0.576 -0.585 -0.554 -0.735 0.387 -0.198 0.679 -0.368 -0.368 -0.368 -0.368 -0.763 0.285 -0.261 0.558 -0.275 -0.275 -0.275 -0.275 -0.435 -0.435 -0.435 -0.435

Building Linear Regression model for cameraaccessory

In [353]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

cameraaccessory_dladd_model = LinearRegression().fit(X_cameraaccessory_dladd_train, y_cameraaccessory_dladd_train)
y_cameraaccessory_dladd_test_pred = cameraaccessory_dladd_model.predict(X_cameraaccessory_dladd_test)

print('R2 Score: {}'.format(r2_score(y_cameraaccessory_dladd_test, y_cameraaccessory_dladd_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_dladd_test, y_cameraaccessory_dladd_test_pred)))
R2 Score: 0.8742548278608485
Mean Squared Error: 0.12444954165378147
With simple linear regression, we get an R2 score of 0.87 and an MSE of 0.12 on the test set.
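The two numbers above are closely related: R2 equals 1 - MSE / Var(y), so when the target has roughly unit variance the two sum to about 1 (here 0.874 + 0.124 is about 0.999). A minimal sketch of that identity on synthetic data follows. It assumes, without confirmation from this notebook, that gmv was standardised in an earlier step; `y_true` and `y_pred` are made-up arrays, not the project data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic target, standardised to mean 0 and population variance 1 (assumption)
rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
y_true = (y_true - y_true.mean()) / y_true.std()

# Hypothetical predictions: the target plus some noise
y_pred = y_true + rng.normal(scale=0.4, size=200)

r2 = r2_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)

# R2 recomputed from the MSE and the (population) variance of the target
r2_from_mse = 1.0 - mse / np.var(y_true)

print("R2:", r2)
print("MSE:", mse)
print("1 - MSE/Var(y):", r2_from_mse)
```

With a standardised target, the MSE can therefore be read as the share of variance left unexplained.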

Building Linear Regression model for cameraaccessory using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross-validation for our linear regression.

In [354]:
y_cameraaccessory_dladd = cameraaccessory_dladd_df.pop('gmv')
X_cameraaccessory_dladd = cameraaccessory_dladd_df
In [355]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

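# Fit on the full data; this fitted model is reused below to read off the coefficients
cameraaccessory_dladd_model_cv = LinearRegression().fit(X_cameraaccessory_dladd, y_cameraaccessory_dladd)
# cross_val_predict refits a clone of the estimator on each of the 10 folds
cameraaccessory_dladd_predictions_cv = cross_val_predict(cameraaccessory_dladd_model_cv, X_cameraaccessory_dladd,
                                                         y_cameraaccessory_dladd, cv=10)
# R2 of the pooled out-of-fold predictions (printed as "accuracy" below)
accuracy = metrics.r2_score(y_cameraaccessory_dladd, cameraaccessory_dladd_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)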
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_dladd, cameraaccessory_dladd_predictions_cv)))
Cross-Predicted Accuracy: 0.8246871357377356
Mean Squared Error: 0.17531286426226445
With simple linear regression and 10-fold cross-validation, we get an R2 score of 0.82 and an MSE of 0.18 (0.175) on the out-of-fold predictions.
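The score above pools the out-of-fold predictions from all folds into a single R2. To see how much the fit varies from fold to fold, the imported but unused cross_val_score can report one score per fold. The sketch below runs on synthetic data; `X_demo`, `y_demo` and `true_coef` are illustrative names and not part of this notebook. Note that the mean of the per-fold R2 values is generally not identical to the pooled R2.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression problem (hypothetical data, not the ElecKart set)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(120, 5))
true_coef = np.array([1.5, -2.0, 0.0, 0.5, 3.0])
y_demo = X_demo @ true_coef + rng.normal(scale=0.5, size=120)

# One R2 value per fold; cv=10 matches the fold count used in the cell above
fold_r2 = cross_val_score(LinearRegression(), X_demo, y_demo, cv=10, scoring="r2")

# scikit-learn reports MSE as a negated score, so flip the sign back
fold_mse = -cross_val_score(LinearRegression(), X_demo, y_demo, cv=10,
                            scoring="neg_mean_squared_error")

print("R2 per fold:", np.round(fold_r2, 3))
print("Mean R2: {:.3f}, std: {:.3f}".format(fold_r2.mean(), fold_r2.std()))
print("Mean MSE: {:.3f}".format(fold_mse.mean()))
```

A large spread across folds would suggest that the single pooled score hides unstable behaviour on some parts of the data.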

Determining Feature Importance for cameraaccessory Using the CV Model

In [356]:
# Linear regression model parameters
# Limit float output to 3 decimal places
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x))
pd.set_option('display.precision',1)


cameraaccessory_lr_model_parameters = list(cameraaccessory_dladd_model_cv.coef_)
cameraaccessory_lr_model_parameters.insert(0, cameraaccessory_dladd_model_cv.intercept_)
cameraaccessory_lr_model_parameters = [round(x, 3) for x in cameraaccessory_lr_model_parameters]
# Take the column names from the frame the model was fitted on, so they line up with coef_
cols = X_cameraaccessory_dladd.columns
cols = cols.insert(0, "constant")
cameraaccessory_lr_coef = list(zip(cols, cameraaccessory_lr_model_parameters))
cameraaccessory_lr_coef
Out[356]:
[('constant', 0.0),
 ('gmv_lag1', -0.018),
 ('gmv_lag2', -0.02),
 ('gmv_lag3', -0.002),
 ('Discount%', 0.001),
 ('Discount%_lag1', 0.006),
 ('Discount%_lag2', 0.01),
 ('Discount%_lag3', 0.018),
 ('deliverybdays', 0.003),
 ('deliverybdays_lag1', -0.002),
 ('deliverybdays_lag2', -0.001),
 ('deliverybdays_lag3', 0.005),
 ('deliverycdays', 0.002),
 ('deliverycdays_lag1', -0.003),
 ('deliverycdays_lag2', -0.0),
 ('deliverycdays_lag3', 0.003),
 ('sla', -0.004),
 ('sla_lag1', -0.001),
 ('sla_lag2', 0.006),
 ('sla_lag3', 0.003),
 ('product_procurement_sla', 0.03),
 ('product_procurement_sla_lag1', 0.017),
 ('product_procurement_sla_lag2', -0.006),
 ('product_procurement_sla_lag3', -0.016),
 ('is_cod', 0.063),
 ('is_cod_lag1', -0.009),
 ('is_cod_lag2', 0.017),
 ('is_cod_lag3', 0.0),
 ('is_mass_market', 0.079),
 ('is_mass_market_lag1', -0.018),
 ('is_mass_market_lag2', 0.0),
 ('is_mass_market_lag3', 0.001),
 ('product_vertical_cameraaccessory', 0.051),
 ('product_vertical_cameraaccessory_lag1', -0.005),
 ('product_vertical_cameraaccessory_lag2', 0.001),
 ('product_vertical_cameraaccessory_lag3', -0.002),
 ('product_vertical_camerabag', 0.102),
 ('product_vertical_camerabag_lag1', 0.003),
 ('product_vertical_camerabag_lag2', -0.025),
 ('product_vertical_camerabag_lag3', -0.001),
 ('product_vertical_camerabattery', 0.07),
 ('product_vertical_camerabattery_lag1', 0.018),
 ('product_vertical_camerabattery_lag2', -0.015),
 ('product_vertical_camerabattery_lag3', -0.01),
 ('product_vertical_camerabatterycharger', 0.044),
 ('product_vertical_camerabatterycharger_lag1', 0.014),
 ('product_vertical_camerabatterycharger_lag2', -0.022),
 ('product_vertical_camerabatterycharger_lag3', 0.004),
 ('product_vertical_camerabatterygrip', 0.016),
 ('product_vertical_camerabatterygrip_lag1', -0.0),
 ('product_vertical_camerabatterygrip_lag2', 0.023),
 ('product_vertical_camerabatterygrip_lag3', -0.001),
 ('product_vertical_cameraeyecup', 0.001),
 ('product_vertical_cameraeyecup_lag1', -0.006),
 ('product_vertical_cameraeyecup_lag2', 0.009),
 ('product_vertical_cameraeyecup_lag3', 0.014),
 ('product_vertical_camerafilmrolls', 0.013),
 ('product_vertical_camerafilmrolls_lag1', -0.006),
 ('product_vertical_camerafilmrolls_lag2', -0.01),
 ('product_vertical_camerafilmrolls_lag3', 0.005),
 ('product_vertical_camerahousing', 0.042),
 ('product_vertical_camerahousing_lag1', -0.035),
 ('product_vertical_camerahousing_lag2', -0.022),
 ('product_vertical_camerahousing_lag3', 0.019),
 ('product_vertical_cameraledlight', -0.015),
 ('product_vertical_cameraledlight_lag1', -0.012),
 ('product_vertical_cameraledlight_lag2', 0.012),
 ('product_vertical_cameraledlight_lag3', -0.001),
 ('product_vertical_cameramicrophone', 0.003),
 ('product_vertical_cameramicrophone_lag1', -0.003),
 ('product_vertical_cameramicrophone_lag2', -0.004),
 ('product_vertical_cameramicrophone_lag3', -0.006),
 ('product_vertical_cameramount', 0.016),
 ('product_vertical_cameramount_lag1', 0.005),
 ('product_vertical_cameramount_lag2', 0.005),
 ('product_vertical_cameramount_lag3', -0.003),
 ('product_vertical_cameraremotecontrol', 0.094),
 ('product_vertical_cameraremotecontrol_lag1', -0.003),
 ('product_vertical_cameraremotecontrol_lag2', -0.022),
 ('product_vertical_cameraremotecontrol_lag3', 0.011),
 ('product_vertical_cameratripod', 0.078),
 ('product_vertical_cameratripod_lag1', -0.046),
 ('product_vertical_cameratripod_lag2', 0.056),
 ('product_vertical_cameratripod_lag3', -0.02),
 ('product_vertical_extensiontube', 0.021),
 ('product_vertical_extensiontube_lag1', -0.028),
 ('product_vertical_extensiontube_lag2', 0.001),
 ('product_vertical_extensiontube_lag3', -0.026),
 ('product_vertical_filter', 0.117),
 ('product_vertical_filter_lag1', -0.028),
 ('product_vertical_filter_lag2', -0.024),
 ('product_vertical_filter_lag3', 0.008),
 ('product_vertical_flash', 0.016),
 ('product_vertical_flash_lag1', -0.013),
 ('product_vertical_flash_lag2', -0.001),
 ('product_vertical_flash_lag3', 0.014),
 ('product_vertical_flashshoeadapter', -0.036),
 ('product_vertical_flashshoeadapter_lag1', 0.024),
 ('product_vertical_flashshoeadapter_lag2', -0.008),
 ('product_vertical_flashshoeadapter_lag3', -0.023),
 ('product_vertical_lens', 0.121),
 ('product_vertical_lens_lag1', -0.029),
 ('product_vertical_lens_lag2', -0.023),
 ('product_vertical_lens_lag3', -0.001),
 ('product_vertical_reflectorumbrella', 0.006),
 ('product_vertical_reflectorumbrella_lag1', -0.009),
 ('product_vertical_reflectorumbrella_lag2', -0.013),
 ('product_vertical_reflectorumbrella_lag3', 0.024),
 ('product_vertical_softbox', 0.008),
 ('product_vertical_softbox_lag1', 0.014),
 ('product_vertical_softbox_lag2', 0.011),
 ('product_vertical_softbox_lag3', 0.036),
 ('product_vertical_strap', 0.028),
 ('product_vertical_strap_lag1', 0.012),
 ('product_vertical_strap_lag2', -0.001),
 ('product_vertical_strap_lag3', 0.006),
 ('product_vertical_teleconverter', 0.021),
 ('product_vertical_teleconverter_lag1', -0.004),
 ('product_vertical_teleconverter_lag2', -0.008),
 ('product_vertical_teleconverter_lag3', -0.01),
 ('product_vertical_telescope', 0.048),
 ('product_vertical_telescope_lag1', 0.002),
 ('product_vertical_telescope_lag2', -0.015),
 ('product_vertical_telescope_lag3', 0.013),
 ('payday_week', -0.009),
 ('payday_week_lag1', 0.018),
 ('payday_week_lag2', -0.001),
 ('payday_week_lag3', -0.001),
 ('holiday_week', -0.009),
 ('holiday_week_lag1', -0.009),
 ('holiday_week_lag2', -0.003),
 ('holiday_week_lag3', 0.019),
 ('Total Investment', 0.007),
 ('Total Investment_lag1', 0.007),
 ('Total Investment_lag2', 0.002),
 ('Total Investment_lag3', -0.019),
 ('Total Investment_SMA_3', 0.006),
 ('Total Investment_SMA_3_lag1', -0.004),
 ('Total Investment_SMA_3_lag2', -0.007),
 ('Total Investment_SMA_3_lag3', 0.002),
 ('Total Investment_SMA_5', -0.001),
 ('Total Investment_SMA_5_lag1', 0.002),
 ('Total Investment_SMA_5_lag2', -0.002),
 ('Total Investment_SMA_5_lag3', -0.003),
 ('Total Investment_EMA_8', 0.006),
 ('Total Investment_EMA_8_lag1', 0.006),
 ('Total Investment_EMA_8_lag2', 0.003),
 ('Total Investment_EMA_8_lag3', 0.002),
 ('Total_Investment_Ad_Stock', 0.005),
 ('Total_Investment_Ad_Stock_lag1', 0.003),
 ('Total_Investment_Ad_Stock_lag2', -0.002),
 ('Total_Investment_Ad_Stock_lag3', -0.004),
 ('TV', 0.007),
 ('TV_lag1', 0.004),
 ('TV_lag2', -0.007),
 ('TV_lag3', -0.011),
 ('TV_SMA_3', 0.001),
 ('TV_SMA_3_lag1', -0.005),
 ('TV_SMA_3_lag2', 0.001),
 ('TV_SMA_3_lag3', 0.008),
 ('TV_SMA_5', 0.003),
 ('TV_SMA_5_lag1', 0.005),
 ('TV_SMA_5_lag2', -0.001),
 ('TV_SMA_5_lag3', 0.001),
 ('TV_EMA_8', 0.005),
 ('TV_EMA_8_lag1', 0.004),
 ('TV_EMA_8_lag2', 0.003),
 ('TV_EMA_8_lag3', 0.006),
 ('TV_Ad_Stock', 0.004),
 ('TV_Ad_Stock_lag1', 0.001),
 ('TV_Ad_Stock_lag2', -0.002),
 ('TV_Ad_Stock_lag3', 0.003),
 ('Digital', -0.011),
 ('Digital_lag1', 0.022),
 ('Digital_lag2', -0.009),
 ('Digital_lag3', -0.012),
 ('Digital_SMA_3', 0.001),
 ('Digital_SMA_3_lag1', -0.001),
 ('Digital_SMA_3_lag2', -0.01),
 ('Digital_SMA_3_lag3', 0.003),
 ('Digital_SMA_5', -0.005),
 ('Digital_SMA_5_lag1', 0.004),
 ('Digital_SMA_5_lag2', 0.001),
 ('Digital_SMA_5_lag3', -0.001),
 ('Digital_EMA_8', 0.002),
 ('Digital_EMA_8_lag1', 0.012),
 ('Digital_EMA_8_lag2', 0.001),
 ('Digital_EMA_8_lag3', 0.003),
 ('Digital_Ad_Stock', -0.001),
 ('Digital_Ad_Stock_lag1', 0.009),
 ('Digital_Ad_Stock_lag2', -0.006),
 ('Digital_Ad_Stock_lag3', -0.001),
 ('Sponsorship', 0.01),
 ('Sponsorship_lag1', 0.001),
 ('Sponsorship_lag2', 0.009),
 ('Sponsorship_lag3', -0.023),
 ('Sponsorship_SMA_3', 0.007),
 ('Sponsorship_SMA_3_lag1', -0.006),
 ('Sponsorship_SMA_3_lag2', -0.008),
 ('Sponsorship_SMA_3_lag3', -0.001),
 ('Sponsorship_SMA_5', -0.002),
 ('Sponsorship_SMA_5_lag1', 0.001),
 ('Sponsorship_SMA_5_lag2', -0.003),
 ('Sponsorship_SMA_5_lag3', -0.006),
 ('Sponsorship_EMA_8', 0.006),
 ('Sponsorship_EMA_8_lag1', 0.004),
 ('Sponsorship_EMA_8_lag2', 0.004),
 ('Sponsorship_EMA_8_lag3', 0.0),
 ('Sponsorship_Ad_Stock', 0.005),
 ('Sponsorship_Ad_Stock_lag1', 0.0),
 ('Sponsorship_Ad_Stock_lag2', -0.001),
 ('Sponsorship_Ad_Stock_lag3', -0.008),
 ('Content Marketing', -0.008),
 ('Content Marketing_lag1', 0.015),
 ('Content Marketing_lag2', -0.004),
 ('Content Marketing_lag3', -0.013),
 ('Content Marketing_SMA_3', 0.001),
 ('Content Marketing_SMA_3_lag1', -0.001),
 ('Content Marketing_SMA_3_lag2', -0.006),
 ('Content Marketing_SMA_3_lag3', 0.003),
 ('Content Marketing_SMA_5', -0.002),
 ('Content Marketing_SMA_5_lag1', 0.005),
 ('Content Marketing_SMA_5_lag2', -0.001),
 ('Content Marketing_SMA_5_lag3', -0.004),
 ('Content Marketing_EMA_8', 0.003),
 ('Content Marketing_EMA_8_lag1', 0.007),
 ('Content Marketing_EMA_8_lag2', 0.002),
 ('Content Marketing_EMA_8_lag3', 0.004),
 ('Content_Marketing_Ad_Stock', -0.0),
 ('Content_Marketing_Ad_Stock_lag1', 0.006),
 ('Content_Marketing_Ad_Stock_lag2', -0.003),
 ('Content_Marketing_Ad_Stock_lag3', -0.001),
 ('Online marketing', 0.016),
 ('Online marketing_lag1', 0.006),
 ('Online marketing_lag2', -0.001),
 ('Online marketing_lag3', -0.008),
 ('Online marketing_SMA_3', 0.008),
 ('Online marketing_SMA_3_lag1', -0.001),
 ('Online marketing_SMA_3_lag2', -0.002),
 ('Online marketing_SMA_3_lag3', 0.007),
 ('Online marketing_SMA_5', 0.003),
 ('Online marketing_SMA_5_lag1', 0.005),
 ('Online marketing_SMA_5_lag2', 0.003),
 ('Online marketing_SMA_5_lag3', 0.003),
 ('Online marketing_EMA_8', 0.007),
 ('Online marketing_EMA_8_lag1', 0.005),
 ('Online marketing_EMA_8_lag2', 0.004),
 ('Online marketing_EMA_8_lag3', 0.004),
 ('Online_marketing_Ad_Stock', 0.009),
 ('Online_marketing_Ad_Stock_lag1', 0.003),
 ('Online_marketing_Ad_Stock_lag2', 0.001),
 ('Online_marketing_Ad_Stock_lag3', 0.002),
 ('Affiliates', 0.017),
 ('Affiliates_lag1', 0.006),
 ('Affiliates_lag2', -0.003),
 ('Affiliates_lag3', -0.004),
 ('Affiliates_SMA_3', 0.007),
 ('Affiliates_SMA_3_lag1', -0.001),
 ('Affiliates_SMA_3_lag2', -0.001),
 ('Affiliates_SMA_3_lag3', 0.009),
 ('Affiliates_SMA_5', 0.004),
 ('Affiliates_SMA_5_lag1', 0.006),
 ('Affiliates_SMA_5_lag2', 0.004),
 ('Affiliates_SMA_5_lag3', 0.004),
 ('Affiliates_EMA_8', 0.007),
 ('Affiliates_EMA_8_lag1', 0.005),
 ('Affiliates_EMA_8_lag2', 0.004),
 ('Affiliates_EMA_8_lag3', 0.005),
 ('Affiliates_Ad_Stock', 0.009),
 ('Affiliates_Ad_Stock_lag1', 0.003),
 ('Affiliates_Ad_Stock_lag2', 0.001),
 ('Affiliates_Ad_Stock_lag3', 0.004),
 ('SEM', -0.006),
 ('SEM_lag1', 0.019),
 ('SEM_lag2', -0.006),
 ('SEM_lag3', -0.013),
 ('SEM_SMA_3', 0.003),
 ('SEM_SMA_3_lag1', -0.002),
 ('SEM_SMA_3_lag2', -0.009),
 ('SEM_SMA_3_lag3', 0.004),
 ('SEM_SMA_5', -0.003),
 ('SEM_SMA_5_lag1', 0.005),
 ('SEM_SMA_5_lag2', 0.001),
 ('SEM_SMA_5_lag3', -0.0),
 ('SEM_EMA_8', 0.004),
 ('SEM_EMA_8_lag1', 0.011),
 ('SEM_EMA_8_lag2', 0.002),
 ('SEM_EMA_8_lag3', 0.004),
 ('SEM_Ad_Stock', 0.002),
 ('SEM_Ad_Stock_lag1', 0.008),
 ('SEM_Ad_Stock_lag2', -0.004),
 ('SEM_Ad_Stock_lag3', -0.001),
 ('Radio', -0.002),
 ('Radio_lag1', -0.003),
 ('Radio_lag2', -0.001),
 ('Radio_lag3', 0.003),
 ('Radio_SMA_3', -0.002),
 ('Radio_SMA_3_lag1', -0.0),
 ('Radio_SMA_3_lag2', 0.0),
 ('Radio_SMA_3_lag3', -0.006),
 ('Radio_SMA_5', -0.004),
 ('Radio_SMA_5_lag1', -0.005),
 ('Radio_SMA_5_lag2', -0.003),
 ('Radio_SMA_5_lag3', -0.004),
 ('Radio_EMA_8', -0.004),
 ('Radio_EMA_8_lag1', -0.003),
 ('Radio_EMA_8_lag2', -0.001),
 ('Radio_EMA_8_lag3', -0.002),
 ('Radio_Ad_Stock', -0.003),
 ('Radio_Ad_Stock_lag1', -0.003),
 ('Radio_Ad_Stock_lag2', -0.001),
 ('Radio_Ad_Stock_lag3', -0.002),
 ('Other', -0.004),
 ('Other_lag1', -0.001),
 ('Other_lag2', -0.003),
 ('Other_lag3', 0.001),
 ('Other_SMA_3', -0.003),
 ('Other_SMA_3_lag1', -0.001),
 ('Other_SMA_3_lag2', 0.001),
 ('Other_SMA_3_lag3', -0.005),
 ('Other_SMA_5', -0.003),
 ('Other_SMA_5_lag1', -0.005),
 ('Other_SMA_5_lag2', -0.005),
 ('Other_SMA_5_lag3', -0.006),
 ('Other_EMA_8', -0.004),
 ('Other_EMA_8_lag1', -0.003),
 ('Other_EMA_8_lag2', -0.002),
 ('Other_EMA_8_lag3', -0.002),
 ('Other_Ad_Stock', -0.004),
 ('Other_Ad_Stock_lag1', -0.002),
 ('Other_Ad_Stock_lag2', -0.002),
 ('Other_Ad_Stock_lag3', -0.002),
 ('NPS', -0.028),
 ('NPS_lag1', 0.014),
 ('NPS_lag2', 0.007),
 ('NPS_lag3', -0.004),
 ('NPS_SMA_3', 0.002),
 ('NPS_SMA_3_lag1', -0.006),
 ('NPS_SMA_3_lag2', -0.015),
 ('NPS_SMA_3_lag3', -0.007),
 ('NPS_SMA_5', -0.017),
 ('NPS_SMA_5_lag1', -0.006),
 ('NPS_SMA_5_lag2', 0.012),
 ('NPS_SMA_5_lag3', 0.01),
 ('Stock Index', -0.009),
 ('Stock Index_lag1', 0.021),
 ('Stock Index_lag2', -0.0),
 ('Stock Index_lag3', 0.007),
 ('Stock Index_SMA_3', 0.004),
 ('Stock Index_SMA_3_lag1', -0.002),
 ('Stock Index_SMA_3_lag2', -0.012),
 ('Stock Index_SMA_3_lag3', -0.0),
 ('Stock Index_SMA_5', -0.013),
 ('Stock Index_SMA_5_lag1', -0.002),
 ('Stock Index_SMA_5_lag2', 0.014),
 ('Stock Index_SMA_5_lag3', 0.013),
 ('Max Temp', 0.016),
 ('Max Temp_lag1', -0.016),
 ('Max Temp_lag2', 0.017),
 ('Max Temp_lag3', -0.02),
 ('Min Temp', -0.016),
 ('Min Temp_lag1', -0.015),
 ('Min Temp_lag2', -0.003),
 ('Min Temp_lag3', 0.001),
 ('Mean Temp', 0.005),
 ('Mean Temp_lag1', -0.02),
 ('Mean Temp_lag2', 0.008),
 ('Mean Temp_lag3', -0.013),
 ('Heat Deg Days', -0.002),
 ('Heat Deg Days_lag1', 0.026),
 ('Heat Deg Days_lag2', -0.012),
 ('Heat Deg Days_lag3', 0.012),
 ('Cool Deg Days', 0.011),
 ('Cool Deg Days_lag1', -0.019),
 ('Cool Deg Days_lag2', -0.018),
 ('Cool Deg Days_lag3', -0.006),
 ('Total Rain (mm)', 0.006),
 ('Total Rain (mm)_lag1', -0.001),
 ('Total Rain (mm)_lag2', 0.016),
 ('Total Rain (mm)_lag3', -0.007),
 ('Total Snow (cm)', -0.01),
 ('Total Snow (cm)_lag1', 0.001),
 ('Total Snow (cm)_lag2', 0.004),
 ('Total Snow (cm)_lag3', -0.024),
 ('Total Precip (mm)', 0.003),
 ('Total Precip (mm)_lag1', -0.001),
 ('Total Precip (mm)_lag2', 0.016),
 ('Total Precip (mm)_lag3', -0.012),
 ('Snow on Grnd (cm)', 0.008),
 ('Snow on Grnd (cm)_lag1', -0.012),
 ('Snow on Grnd (cm)_lag2', -0.005),
 ('Snow on Grnd (cm)_lag3', 0.004),
 ('Sale', 0.03),
 ('Sale_lag1', -0.043),
 ('Sale_lag2', -0.012),
 ('Sale_lag3', 0.007)]
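Sorting by signed coefficient pushes the large negative effects in the list above (for example product_vertical_cameratripod_lag1 at -0.046 and Sale_lag1 at -0.043) to the bottom. An alternative is to rank by absolute value. The sketch below shows this on synthetic data; `demo_cols`, `X_demo`, `y_demo` and `coef_table` are hypothetical names. It also assumes the features share a common scale, which the scaled values in the preview suggest but which is not confirmed here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic, already-standardised features (hypothetical stand-ins)
rng = np.random.default_rng(7)
demo_cols = ["feat_a", "feat_b", "feat_c", "feat_d"]
X_demo = pd.DataFrame(rng.normal(size=(200, 4)), columns=demo_cols)
y_demo = (0.8 * X_demo["feat_a"] - 1.2 * X_demo["feat_b"]
          + 0.1 * X_demo["feat_c"] + rng.normal(scale=0.1, size=200))

demo_model = LinearRegression().fit(X_demo, y_demo)

# Pair each coefficient with its column name and rank by magnitude
coef_table = (
    pd.DataFrame({"Features": X_demo.columns, "Coefficients": demo_model.coef_})
    .assign(abs_coef=lambda d: d["Coefficients"].abs())
    .sort_values("abs_coef", ascending=False)
    .reset_index(drop=True)
)
print(coef_table)
```

Ranking by magnitude keeps strong negative drivers next to strong positive ones, while the sign column still shows the direction of each effect.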
In [357]:
# Turn the (feature, coefficient) pairs into a DataFrame
cameraaccessory_lr_coef_df = pd.DataFrame(cameraaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.rename(columns=col_rename)
# Drop the first row (the intercept, labelled "constant")
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.iloc[1:,:]
# Keep only features whose rounded coefficient is non-zero, largest first
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.loc[cameraaccessory_lr_coef_df['Coefficients']!=0.0]
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
cameraaccessory_lr_coef_df
Out[357]:
Features Coefficients
100 product_vertical_lens 0.121
88 product_vertical_filter 0.117
36 product_vertical_camerabag 0.102
76 product_vertical_cameraremotecontrol 0.094
28 is_mass_market 0.079
80 product_vertical_cameratripod 0.078
40 product_vertical_camerabattery 0.070
24 is_cod 0.063
82 product_vertical_cameratripod_lag2 0.056
32 product_vertical_cameraaccessory 0.051
120 product_vertical_telescope 0.048
44 product_vertical_camerabatterycharger 0.044
60 product_vertical_camerahousing 0.042
111 product_vertical_softbox_lag3 0.036
20 product_procurement_sla 0.030
392 Sale 0.030
112 product_vertical_strap 0.028
369 Heat Deg Days_lag1 0.026
107 product_vertical_reflectorumbrella_lag3 0.024
97 product_vertical_flashshoeadapter_lag1 0.024
50 product_vertical_camerabatterygrip_lag2 0.023
173 Digital_lag1 0.022
116 product_vertical_teleconverter 0.021
345 Stock Index_lag1 0.021
84 product_vertical_extensiontube 0.021
63 product_vertical_camerahousing_lag3 0.019
273 SEM_lag1 0.019
131 holiday_week_lag3 0.019
125 payday_week_lag1 0.018
7 Discount%_lag3 0.018
41 product_vertical_camerabattery_lag1 0.018
21 product_procurement_sla_lag1 0.017
26 is_cod_lag2 0.017
252 Affiliates 0.017
358 Max Temp_lag2 0.017
232 Online marketing 0.016
72 product_vertical_cameramount 0.016
48 product_vertical_camerabatterygrip 0.016
92 product_vertical_flash 0.016
356 Max Temp 0.016
378 Total Rain (mm)_lag2 0.016
386 Total Precip (mm)_lag2 0.016
213 Content Marketing_lag1 0.015
333 NPS_lag1 0.014
55 product_vertical_cameraeyecup_lag3 0.014
45 product_vertical_camerabatterycharger_lag1 0.014
95 product_vertical_flash_lag3 0.014
109 product_vertical_softbox_lag1 0.014
354 Stock Index_SMA_5_lag2 0.014
123 product_vertical_telescope_lag3 0.013
56 product_vertical_camerafilmrolls 0.013
355 Stock Index_SMA_5_lag3 0.013
113 product_vertical_strap_lag1 0.012
185 Digital_EMA_8_lag1 0.012
371 Heat Deg Days_lag3 0.012
342 NPS_SMA_5_lag2 0.012
66 product_vertical_cameraledlight_lag2 0.012
110 product_vertical_softbox_lag2 0.011
372 Cool Deg Days 0.011
79 product_vertical_cameraremotecontrol_lag3 0.011
285 SEM_EMA_8_lag1 0.011
6 Discount%_lag2 0.010
192 Sponsorship 0.010
343 NPS_SMA_5_lag3 0.010
248 Online_marketing_Ad_Stock 0.009
189 Digital_Ad_Stock_lag1 0.009
268 Affiliates_Ad_Stock 0.009
194 Sponsorship_lag2 0.009
54 product_vertical_cameraeyecup_lag2 0.009
259 Affiliates_SMA_3_lag3 0.009
108 product_vertical_softbox 0.008
159 TV_SMA_3_lag3 0.008
289 SEM_Ad_Stock_lag1 0.008
236 Online marketing_SMA_3 0.008
388 Snow on Grnd (cm) 0.008
91 product_vertical_filter_lag3 0.008
366 Mean Temp_lag2 0.008
133 Total Investment_lag1 0.007
347 Stock Index_lag3 0.007
334 NPS_lag2 0.007
132 Total Investment 0.007
256 Affiliates_SMA_3 0.007
152 TV 0.007
225 Content Marketing_EMA_8_lag1 0.007
264 Affiliates_EMA_8 0.007
239 Online marketing_SMA_3_lag3 0.007
244 Online marketing_EMA_8 0.007
196 Sponsorship_SMA_3 0.007
395 Sale_lag3 0.007
376 Total Rain (mm) 0.006
261 Affiliates_SMA_5_lag1 0.006
5 Discount%_lag1 0.006
167 TV_EMA_8_lag3 0.006
18 sla_lag2 0.006
253 Affiliates_lag1 0.006
104 product_vertical_reflectorumbrella 0.006
115 product_vertical_strap_lag3 0.006
145 Total Investment_EMA_8_lag1 0.006
233 Online marketing_lag1 0.006
204 Sponsorship_EMA_8 0.006
144 Total Investment_EMA_8 0.006
229 Content_Marketing_Ad_Stock_lag1 0.006
136 Total Investment_SMA_3 0.006
148 Total_Investment_Ad_Stock 0.005
267 Affiliates_EMA_8_lag3 0.005
161 TV_SMA_5_lag1 0.005
11 deliverybdays_lag3 0.005
164 TV_EMA_8 0.005
73 product_vertical_cameramount_lag1 0.005
245 Online marketing_EMA_8_lag1 0.005
59 product_vertical_camerafilmrolls_lag3 0.005
241 Online marketing_SMA_5_lag1 0.005
208 Sponsorship_Ad_Stock 0.005
221 Content Marketing_SMA_5_lag1 0.005
281 SEM_SMA_5_lag1 0.005
265 Affiliates_EMA_8_lag1 0.005
74 product_vertical_cameramount_lag2 0.005
364 Mean Temp 0.005
168 TV_Ad_Stock 0.004
47 product_vertical_camerabatterycharger_lag3 0.004
181 Digital_SMA_5_lag1 0.004
247 Online marketing_EMA_8_lag3 0.004
263 Affiliates_SMA_5_lag3 0.004
260 Affiliates_SMA_5 0.004
246 Online marketing_EMA_8_lag2 0.004
271 Affiliates_Ad_Stock_lag3 0.004
262 Affiliates_SMA_5_lag2 0.004
382 Total Snow (cm)_lag2 0.004
279 SEM_SMA_3_lag3 0.004
266 Affiliates_EMA_8_lag2 0.004
348 Stock Index_SMA_3 0.004
391 Snow on Grnd (cm)_lag3 0.004
165 TV_EMA_8_lag1 0.004
205 Sponsorship_EMA_8_lag1 0.004
206 Sponsorship_EMA_8_lag2 0.004
284 SEM_EMA_8 0.004
287 SEM_EMA_8_lag3 0.004
227 Content Marketing_EMA_8_lag3 0.004
153 TV_lag1 0.004
384 Total Precip (mm) 0.003
160 TV_SMA_5 0.003
179 Digital_SMA_3_lag3 0.003
166 TV_EMA_8_lag2 0.003
19 sla_lag3 0.003
295 Radio_lag3 0.003
37 product_vertical_camerabag_lag1 0.003
219 Content Marketing_SMA_3_lag3 0.003
187 Digital_EMA_8_lag3 0.003
249 Online_marketing_Ad_Stock_lag1 0.003
15 deliverycdays_lag3 0.003
224 Content Marketing_EMA_8 0.003
68 product_vertical_cameramicrophone 0.003
240 Online marketing_SMA_5 0.003
276 SEM_SMA_3 0.003
269 Affiliates_Ad_Stock_lag1 0.003
8 deliverybdays 0.003
242 Online marketing_SMA_5_lag2 0.003
243 Online marketing_SMA_5_lag3 0.003
146 Total Investment_EMA_8_lag2 0.003
149 Total_Investment_Ad_Stock_lag1 0.003
171 TV_Ad_Stock_lag3 0.003
251 Online_marketing_Ad_Stock_lag3 0.002
184 Digital_EMA_8 0.002
226 Content Marketing_EMA_8_lag2 0.002
12 deliverycdays 0.002
121 product_vertical_telescope_lag1 0.002
141 Total Investment_SMA_5_lag1 0.002
288 SEM_Ad_Stock 0.002
286 SEM_EMA_8_lag2 0.002
336 NPS_SMA_3 0.002
134 Total Investment_lag2 0.002
147 Total Investment_EMA_8_lag3 0.002
139 Total Investment_SMA_3_lag3 0.002
381 Total Snow (cm)_lag1 0.001
4 Discount% 0.001
86 product_vertical_extensiontube_lag2 0.001
34 product_vertical_cameraaccessory_lag2 0.001
169 TV_Ad_Stock_lag1 0.001
201 Sponsorship_SMA_5_lag1 0.001
270 Affiliates_Ad_Stock_lag2 0.001
52 product_vertical_cameraeyecup 0.001
282 SEM_SMA_5_lag2 0.001
31 is_mass_market_lag3 0.001
363 Min Temp_lag3 0.001
193 Sponsorship_lag1 0.001
250 Online_marketing_Ad_Stock_lag2 0.001
182 Digital_SMA_5_lag2 0.001
176 Digital_SMA_3 0.001
163 TV_SMA_5_lag3 0.001
158 TV_SMA_3_lag2 0.001
156 TV_SMA_3 0.001
216 Content Marketing_SMA_3 0.001
315 Other_lag3 0.001
318 Other_SMA_3_lag2 0.001
186 Digital_EMA_8_lag2 0.001
39 product_vertical_camerabag_lag3 -0.001
294 Radio_lag2 -0.001
222 Content Marketing_SMA_5_lag2 -0.001
291 SEM_Ad_Stock_lag3 -0.001
258 Affiliates_SMA_3_lag2 -0.001
257 Affiliates_SMA_3_lag1 -0.001
10 deliverybdays_lag2 -0.001
310 Radio_Ad_Stock_lag2 -0.001
237 Online marketing_SMA_3_lag1 -0.001
306 Radio_EMA_8_lag2 -0.001
317 Other_SMA_3_lag1 -0.001
313 Other_lag1 -0.001
234 Online marketing_lag2 -0.001
51 product_vertical_camerabatterygrip_lag3 -0.001
17 sla_lag1 -0.001
94 product_vertical_flash_lag2 -0.001
231 Content_Marketing_Ad_Stock_lag3 -0.001
199 Sponsorship_SMA_3_lag3 -0.001
67 product_vertical_cameraledlight_lag3 -0.001
127 payday_week_lag3 -0.001
126 payday_week_lag2 -0.001
162 TV_SMA_5_lag2 -0.001
217 Content Marketing_SMA_3_lag1 -0.001
177 Digital_SMA_3_lag1 -0.001
183 Digital_SMA_5_lag3 -0.001
140 Total Investment_SMA_5 -0.001
114 product_vertical_strap_lag2 -0.001
191 Digital_Ad_Stock_lag3 -0.001
377 Total Rain (mm)_lag1 -0.001
188 Digital_Ad_Stock -0.001
103 product_vertical_lens_lag3 -0.001
210 Sponsorship_Ad_Stock_lag2 -0.001
385 Total Precip (mm)_lag1 -0.001
292 Radio -0.002
200 Sponsorship_SMA_5 -0.002
307 Radio_EMA_8_lag3 -0.002
296 Radio_SMA_3 -0.002
9 deliverybdays_lag1 -0.002
170 TV_Ad_Stock_lag2 -0.002
330 Other_Ad_Stock_lag2 -0.002
311 Radio_Ad_Stock_lag3 -0.002
368 Heat Deg Days -0.002
220 Content Marketing_SMA_5 -0.002
35 product_vertical_cameraaccessory_lag3 -0.002
150 Total_Investment_Ad_Stock_lag2 -0.002
142 Total Investment_SMA_5_lag2 -0.002
3 gmv_lag3 -0.002
277 SEM_SMA_3_lag1 -0.002
349 Stock Index_SMA_3_lag1 -0.002
329 Other_Ad_Stock_lag1 -0.002
331 Other_Ad_Stock_lag3 -0.002
327 Other_EMA_8_lag3 -0.002
353 Stock Index_SMA_5_lag1 -0.002
238 Online marketing_SMA_3_lag2 -0.002
326 Other_EMA_8_lag2 -0.002
325 Other_EMA_8_lag1 -0.003
280 SEM_SMA_5 -0.003
69 product_vertical_cameramicrophone_lag1 -0.003
130 holiday_week_lag2 -0.003
314 Other_lag2 -0.003
309 Radio_Ad_Stock_lag1 -0.003
362 Min Temp_lag2 -0.003
320 Other_SMA_5 -0.003
77 product_vertical_cameraremotecontrol_lag1 -0.003
302 Radio_SMA_5_lag2 -0.003
305 Radio_EMA_8_lag1 -0.003
316 Other_SMA_3 -0.003
308 Radio_Ad_Stock -0.003
75 product_vertical_cameramount_lag3 -0.003
13 deliverycdays_lag1 -0.003
293 Radio_lag1 -0.003
254 Affiliates_lag2 -0.003
202 Sponsorship_SMA_5_lag2 -0.003
143 Total Investment_SMA_5_lag3 -0.003
230 Content_Marketing_Ad_Stock_lag2 -0.003
328 Other_Ad_Stock -0.004
303 Radio_SMA_5_lag3 -0.004
304 Radio_EMA_8 -0.004
324 Other_EMA_8 -0.004
290 SEM_Ad_Stock_lag2 -0.004
255 Affiliates_lag3 -0.004
16 sla -0.004
335 NPS_lag3 -0.004
312 Other -0.004
70 product_vertical_cameramicrophone_lag2 -0.004
151 Total_Investment_Ad_Stock_lag3 -0.004
137 Total Investment_SMA_3_lag1 -0.004
223 Content Marketing_SMA_5_lag3 -0.004
117 product_vertical_teleconverter_lag1 -0.004
214 Content Marketing_lag2 -0.004
300 Radio_SMA_5 -0.004
321 Other_SMA_5_lag1 -0.005
301 Radio_SMA_5_lag1 -0.005
322 Other_SMA_5_lag2 -0.005
319 Other_SMA_3_lag3 -0.005
157 TV_SMA_3_lag1 -0.005
33 product_vertical_cameraaccessory_lag1 -0.005
390 Snow on Grnd (cm)_lag2 -0.005
180 Digital_SMA_5 -0.005
203 Sponsorship_SMA_5_lag3 -0.006
375 Cool Deg Days_lag3 -0.006
337 NPS_SMA_3_lag1 -0.006
341 NPS_SMA_5_lag1 -0.006
197 Sponsorship_SMA_3_lag1 -0.006
22 product_procurement_sla_lag2 -0.006
53 product_vertical_cameraeyecup_lag1 -0.006
323 Other_SMA_5_lag3 -0.006
190 Digital_Ad_Stock_lag2 -0.006
57 product_vertical_camerafilmrolls_lag1 -0.006
274 SEM_lag2 -0.006
218 Content Marketing_SMA_3_lag2 -0.006
71 product_vertical_cameramicrophone_lag3 -0.006
299 Radio_SMA_3_lag3 -0.006
272 SEM -0.006
154 TV_lag2 -0.007
339 NPS_SMA_3_lag3 -0.007
379 Total Rain (mm)_lag3 -0.007
138 Total Investment_SMA_3_lag2 -0.007
212 Content Marketing -0.008
118 product_vertical_teleconverter_lag2 -0.008
235 Online marketing_lag3 -0.008
98 product_vertical_flashshoeadapter_lag2 -0.008
211 Sponsorship_Ad_Stock_lag3 -0.008
198 Sponsorship_SMA_3_lag2 -0.008
278 SEM_SMA_3_lag2 -0.009
129 holiday_week_lag1 -0.009
128 holiday_week -0.009
174 Digital_lag2 -0.009
105 product_vertical_reflectorumbrella_lag1 -0.009
344 Stock Index -0.009
25 is_cod_lag1 -0.009
124 payday_week -0.009
380 Total Snow (cm) -0.010
58 product_vertical_camerafilmrolls_lag2 -0.010
178 Digital_SMA_3_lag2 -0.010
119 product_vertical_teleconverter_lag3 -0.010
43 product_vertical_camerabattery_lag3 -0.010
155 TV_lag3 -0.011
172 Digital -0.011
389 Snow on Grnd (cm)_lag1 -0.012
175 Digital_lag3 -0.012
394 Sale_lag2 -0.012
350 Stock Index_SMA_3_lag2 -0.012
387 Total Precip (mm)_lag3 -0.012
370 Heat Deg Days_lag2 -0.012
65 product_vertical_cameraledlight_lag1 -0.012
352 Stock Index_SMA_5 -0.013
367 Mean Temp_lag3 -0.013
106 product_vertical_reflectorumbrella_lag2 -0.013
93 product_vertical_flash_lag1 -0.013
215 Content Marketing_lag3 -0.013
275 SEM_lag3 -0.013
361 Min Temp_lag1 -0.015
338 NPS_SMA_3_lag2 -0.015
42 product_vertical_camerabattery_lag2 -0.015
64 product_vertical_cameraledlight -0.015
122 product_vertical_telescope_lag2 -0.015
360 Min Temp -0.016
357 Max Temp_lag1 -0.016
23 product_procurement_sla_lag3 -0.016
340 NPS_SMA_5 -0.017
374 Cool Deg Days_lag2 -0.018
1 gmv_lag1 -0.018
29 is_mass_market_lag1 -0.018
373 Cool Deg Days_lag1 -0.019
135 Total Investment_lag3 -0.019
365 Mean Temp_lag1 -0.020
83 product_vertical_cameratripod_lag3 -0.020
359 Max Temp_lag3 -0.020
2 gmv_lag2 -0.020
62 product_vertical_camerahousing_lag2 -0.022
78 product_vertical_cameraremotecontrol_lag2 -0.022
46 product_vertical_camerabatterycharger_lag2 -0.022
195 Sponsorship_lag3 -0.023
102 product_vertical_lens_lag2 -0.023
99 product_vertical_flashshoeadapter_lag3 -0.023
90 product_vertical_filter_lag2 -0.024
383 Total Snow (cm)_lag3 -0.024
38 product_vertical_camerabag_lag2 -0.025
87 product_vertical_extensiontube_lag3 -0.026
332 NPS -0.028
85 product_vertical_extensiontube_lag1 -0.028
89 product_vertical_filter_lag1 -0.028
101 product_vertical_lens_lag1 -0.029
61 product_vertical_camerahousing_lag1 -0.035
96 product_vertical_flashshoeadapter -0.036
393 Sale_lag1 -0.043
81 product_vertical_cameratripod_lag1 -0.046

Plotting the Features in descending order of Importance for cameraaccessory

In [358]:
# Use a tall figure so that every feature label fits on the y-axis.
plt.figure(figsize=(10, 35), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=cameraaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest positive coefficients for GMV (revenue) for cameraaccessory are:
Features Coefficients
product_vertical_lens 0.121
product_vertical_filter 0.117
product_vertical_camerabag 0.102
product_vertical_cameraremotecontrol 0.094
is_mass_market 0.079
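
The coefficient table above is sorted by signed value, so strong negative effects such as product_vertical_cameratripod_lag1 (-0.046) and Sale_lag1 (-0.043) end up at the bottom. A minimal sketch of ranking by absolute magnitude instead, using a handful of values copied from the output above; the names `sample_coef`, `sample_coef_df`, `abs_coef` and `top_by_magnitude` are illustrative and not part of the notebook:

```python
import pandas as pd

# A few (feature, coefficient) pairs copied from the cameraaccessory output above.
sample_coef = [
    ('product_vertical_lens', 0.121),
    ('product_vertical_filter', 0.117),
    ('is_mass_market', 0.079),
    ('Sale_lag1', -0.043),
    ('product_vertical_cameratripod_lag1', -0.046),
    ('Discount%', 0.001),
]

sample_coef_df = pd.DataFrame(sample_coef, columns=['Features', 'Coefficients'])

# Rank by magnitude so that strong negative effects are not pushed to the bottom.
sample_coef_df['abs_coef'] = sample_coef_df['Coefficients'].abs()
top_by_magnitude = sample_coef_df.sort_values('abs_coef', ascending=False).head(5)
print(top_by_magnitude[['Features', 'Coefficients']])
```

Whether magnitudes are comparable across features depends on how the inputs were scaled earlier in the notebook, so this ranking is only a reading aid.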

Building Linear Regression model for gamingaccessory

In [359]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

gamingaccessory_dladd_model = LinearRegression().fit(X_gamingaccessory_dladd_train, y_gamingaccessory_dladd_train)
y_gamingaccessory_dladd_test_pred = gamingaccessory_dladd_model.predict(X_gamingaccessory_dladd_test)

print('R2 Score: {}'.format(r2_score(y_gamingaccessory_dladd_test, y_gamingaccessory_dladd_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_dladd_test, y_gamingaccessory_dladd_test_pred)))
R2 Score: 0.8743788823250918
Mean Squared Error: 0.10053947423668204
With simple linear regression on the train/test split, we get an R2 score of 0.87 and an MSE of 0.10 on the test set.
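
For reference, the two metrics reported above can be computed by hand. A minimal sketch on made-up numbers (the `y_true` and `y_pred` values are illustrative only and are not taken from the gamingaccessory data):

```python
# Illustrative values only; not from the notebook's data.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

n = len(y_true)
mean_true = sum(y_true) / n

# Residual sum of squares and total sum of squares.
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)

mse = ss_res / n            # mean squared error
r2 = 1 - ss_res / ss_tot    # coefficient of determination

print('R2 Score: {}'.format(r2))
print('Mean Squared Error: {}'.format(mse))
```

These are the same definitions that `r2_score` and `mean_squared_error` in scikit-learn use with their default arguments.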

Building Linear Regression model for gamingaccessory using K-fold Cross Validation

We will use scikit-learn's cross_val_predict with 10-fold cross-validation for our linear regression.

In [360]:
y_gamingaccessory_dladd = gamingaccessory_dladd_df.pop('gmv')
X_gamingaccessory_dladd = gamingaccessory_dladd_df
In [361]:
# Make cross-validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

# Fit on the full data set; this fitted model supplies the coefficients used below.
gamingaccessory_dladd_model_cv = LinearRegression().fit(X_gamingaccessory_dladd, y_gamingaccessory_dladd)
# cross_val_predict clones the estimator and refits it on each of the 10 training folds.
gamingaccessory_dladd_predictions_cv = cross_val_predict(gamingaccessory_dladd_model_cv, X_gamingaccessory_dladd,
                                                         y_gamingaccessory_dladd, cv=10)
# The "accuracy" reported here is the R2 score of the cross-validated predictions.
accuracy = metrics.r2_score(y_gamingaccessory_dladd, gamingaccessory_dladd_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_dladd, gamingaccessory_dladd_predictions_cv)))
Cross-Predicted Accuracy: 0.919259247127848
Mean Squared Error: 0.08074075287215199
With linear regression and 10-fold cross-validation, we get an R2 score of 0.92 and an MSE of 0.08.
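
`cross_val_predict` clones the estimator and refits it on each training fold, so the predictions it returns do not come from a model that has seen the row being predicted. A minimal sketch on synthetic data; `X_demo`, `y_demo` and the seeded, shuffled `KFold` are assumptions for illustration, whereas the notebook passes `cv=10` directly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic data: 100 rows, 3 features, a linear signal plus small noise.
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Each row is predicted by a model trained on the other 9 folds.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
demo_predictions = cross_val_predict(LinearRegression(), X_demo, y_demo, cv=cv)

print('Cross-validated R2:', r2_score(y_demo, demo_predictions))
print('Cross-validated MSE:', mean_squared_error(y_demo, demo_predictions))
```

Passing an unfitted `LinearRegression()` gives the same cross-validated predictions as passing a fitted one, since the estimator is cloned before each fold is fitted.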

Determining Feature Importance for gamingaccessory with model with cv

In [362]:
# Linear regression model parameters
# Limit float output to 3 decimal places
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


gamingaccessory_lr_model_parameters = list(gamingaccessory_dladd_model_cv.coef_)
gamingaccessory_lr_model_parameters.insert(0, gamingaccessory_dladd_model_cv.intercept_)
gamingaccessory_lr_model_parameters = [round(x, 3) for x in gamingaccessory_lr_model_parameters]
# Take the feature names from the frame the model was fitted on.
cols = X_gamingaccessory_dladd.columns
cols = cols.insert(0, "constant")
gamingaccessory_lr_coef = list(zip(cols, gamingaccessory_lr_model_parameters))
gamingaccessory_lr_coef
Out[362]:
[('constant', 0.0),
 ('gmv_lag1', -0.012),
 ('gmv_lag2', -0.019),
 ('gmv_lag3', -0.006),
 ('Discount%', 0.007),
 ('Discount%_lag1', -0.009),
 ('Discount%_lag2', 0.021),
 ('Discount%_lag3', -0.008),
 ('deliverybdays', 0.025),
 ('deliverybdays_lag1', -0.004),
 ('deliverybdays_lag2', -0.003),
 ('deliverybdays_lag3', 0.02),
 ('deliverycdays', 0.025),
 ('deliverycdays_lag1', -0.005),
 ('deliverycdays_lag2', -0.004),
 ('deliverycdays_lag3', 0.021),
 ('sla', 0.016),
 ('sla_lag1', -0.011),
 ('sla_lag2', 0.005),
 ('sla_lag3', -0.015),
 ('product_procurement_sla', 0.017),
 ('product_procurement_sla_lag1', -0.01),
 ('product_procurement_sla_lag2', 0.01),
 ('product_procurement_sla_lag3', 0.014),
 ('is_cod', 0.06),
 ('is_cod_lag1', 0.001),
 ('is_cod_lag2', -0.004),
 ('is_cod_lag3', 0.003),
 ('is_mass_market', 0.109),
 ('is_mass_market_lag1', -0.005),
 ('is_mass_market_lag2', -0.013),
 ('is_mass_market_lag3', -0.007),
 ('product_vertical_gamecontrolmount', 0.0),
 ('product_vertical_gamecontrolmount_lag1', 0.01),
 ('product_vertical_gamecontrolmount_lag2', -0.018),
 ('product_vertical_gamecontrolmount_lag3', -0.006),
 ('product_vertical_gamepad', 0.15),
 ('product_vertical_gamepad_lag1', -0.012),
 ('product_vertical_gamepad_lag2', -0.012),
 ('product_vertical_gamepad_lag3', -0.002),
 ('product_vertical_gamingaccessorykit', 0.115),
 ('product_vertical_gamingaccessorykit_lag1', -0.015),
 ('product_vertical_gamingaccessorykit_lag2', -0.026),
 ('product_vertical_gamingaccessorykit_lag3', 0.016),
 ('product_vertical_gamingadapter', 0.054),
 ('product_vertical_gamingadapter_lag1', -0.011),
 ('product_vertical_gamingadapter_lag2', 0.017),
 ('product_vertical_gamingadapter_lag3', -0.002),
 ('product_vertical_gamingchargingstation', 0.007),
 ('product_vertical_gamingchargingstation_lag1', -0.004),
 ('product_vertical_gamingchargingstation_lag2', -0.006),
 ('product_vertical_gamingchargingstation_lag3', -0.009),
 ('product_vertical_gamingheadset', 0.086),
 ('product_vertical_gamingheadset_lag1', -0.004),
 ('product_vertical_gamingheadset_lag2', -0.014),
 ('product_vertical_gamingheadset_lag3', -0.014),
 ('product_vertical_gamingkeyboard', 0.089),
 ('product_vertical_gamingkeyboard_lag1', -0.018),
 ('product_vertical_gamingkeyboard_lag2', 0.012),
 ('product_vertical_gamingkeyboard_lag3', 0.013),
 ('product_vertical_gamingmemorycard', 0.015),
 ('product_vertical_gamingmemorycard_lag1', -0.014),
 ('product_vertical_gamingmemorycard_lag2', 0.009),
 ('product_vertical_gamingmemorycard_lag3', 0.01),
 ('product_vertical_gamingmouse', 0.077),
 ('product_vertical_gamingmouse_lag1', 0.006),
 ('product_vertical_gamingmouse_lag2', -0.012),
 ('product_vertical_gamingmouse_lag3', -0.004),
 ('product_vertical_gamingmousepad', 0.076),
 ('product_vertical_gamingmousepad_lag1', -0.012),
 ('product_vertical_gamingmousepad_lag2', -0.025),
 ('product_vertical_gamingmousepad_lag3', 0.009),
 ('product_vertical_gamingspeaker', 0.077),
 ('product_vertical_gamingspeaker_lag1', -0.017),
 ('product_vertical_gamingspeaker_lag2', 0.022),
 ('product_vertical_gamingspeaker_lag3', -0.008),
 ('product_vertical_joystickgamingwheel', 0.071),
 ('product_vertical_joystickgamingwheel_lag1', -0.023),
 ('product_vertical_joystickgamingwheel_lag2', 0.007),
 ('product_vertical_joystickgamingwheel_lag3', -0.004),
 ('product_vertical_motioncontroller', 0.1),
 ('product_vertical_motioncontroller_lag1', 0.014),
 ('product_vertical_motioncontroller_lag2', -0.016),
 ('product_vertical_motioncontroller_lag3', -0.017),
 ('product_vertical_tvoutcableaccessory', 0.087),
 ('product_vertical_tvoutcableaccessory_lag1', 0.003),
 ('product_vertical_tvoutcableaccessory_lag2', 0.008),
 ('product_vertical_tvoutcableaccessory_lag3', -0.018),
 ('payday_week', -0.038),
 ('payday_week_lag1', -0.003),
 ('payday_week_lag2', 0.02),
 ('payday_week_lag3', 0.02),
 ('holiday_week', -0.0),
 ('holiday_week_lag1', 0.016),
 ('holiday_week_lag2', -0.02),
 ('holiday_week_lag3', 0.016),
 ('Total Investment', 0.005),
 ('Total Investment_lag1', 0.004),
 ('Total Investment_lag2', 0.002),
 ('Total Investment_lag3', 0.008),
 ('Total Investment_SMA_3', 0.004),
 ('Total Investment_SMA_3_lag1', 0.005),
 ('Total Investment_SMA_3_lag2', 0.0),
 ('Total Investment_SMA_3_lag3', -0.0),
 ('Total Investment_SMA_5', 0.002),
 ('Total Investment_SMA_5_lag1', 0.0),
 ('Total Investment_SMA_5_lag2', 0.002),
 ('Total Investment_SMA_5_lag3', -0.001),
 ('Total Investment_EMA_8', 0.003),
 ('Total Investment_EMA_8_lag1', 0.002),
 ('Total Investment_EMA_8_lag2', 0.001),
 ('Total Investment_EMA_8_lag3', 0.001),
 ('Total_Investment_Ad_Stock', 0.004),
 ('Total_Investment_Ad_Stock_lag1', 0.003),
 ('Total_Investment_Ad_Stock_lag2', 0.001),
 ('Total_Investment_Ad_Stock_lag3', 0.001),
 ('TV', 0.027),
 ('TV_lag1', 0.008),
 ('TV_lag2', -0.004),
 ('TV_lag3', -0.015),
 ('TV_SMA_3', 0.011),
 ('TV_SMA_3_lag1', -0.004),
 ('TV_SMA_3_lag2', 0.001),
 ('TV_SMA_3_lag3', -0.002),
 ('TV_SMA_5', 0.009),
 ('TV_SMA_5_lag1', -0.001),
 ('TV_SMA_5_lag2', 0.0),
 ('TV_SMA_5_lag3', 0.001),
 ('TV_EMA_8', 0.009),
 ('TV_EMA_8_lag1', 0.002),
 ('TV_EMA_8_lag2', -0.001),
 ('TV_EMA_8_lag3', 0.0),
 ('TV_Ad_Stock', 0.014),
 ('TV_Ad_Stock_lag1', 0.002),
 ('TV_Ad_Stock_lag2', -0.003),
 ('TV_Ad_Stock_lag3', -0.002),
 ('Digital', -0.015),
 ('Digital_lag1', 0.017),
 ('Digital_lag2', -0.004),
 ('Digital_lag3', -0.001),
 ('Digital_SMA_3', 0.0),
 ('Digital_SMA_3_lag1', 0.005),
 ('Digital_SMA_3_lag2', -0.009),
 ('Digital_SMA_3_lag3', -0.007),
 ('Digital_SMA_5', -0.007),
 ('Digital_SMA_5_lag1', -0.003),
 ('Digital_SMA_5_lag2', -0.003),
 ('Digital_SMA_5_lag3', -0.008),
 ('Digital_EMA_8', -0.005),
 ('Digital_EMA_8_lag1', 0.003),
 ('Digital_EMA_8_lag2', -0.005),
 ('Digital_EMA_8_lag3', -0.004),
 ('Digital_Ad_Stock', -0.004),
 ('Digital_Ad_Stock_lag1', 0.006),
 ('Digital_Ad_Stock_lag2', -0.006),
 ('Digital_Ad_Stock_lag3', -0.005),
 ('Sponsorship', 0.01),
 ('Sponsorship_lag1', -0.006),
 ('Sponsorship_lag2', 0.007),
 ('Sponsorship_lag3', 0.009),
 ('Sponsorship_SMA_3', 0.005),
 ('Sponsorship_SMA_3_lag1', 0.004),
 ('Sponsorship_SMA_3_lag2', 0.003),
 ('Sponsorship_SMA_3_lag3', 0.003),
 ('Sponsorship_SMA_5', 0.004),
 ('Sponsorship_SMA_5_lag1', 0.002),
 ('Sponsorship_SMA_5_lag2', 0.002),
 ('Sponsorship_SMA_5_lag3', -0.002),
 ('Sponsorship_EMA_8', 0.003),
 ('Sponsorship_EMA_8_lag1', 0.0),
 ('Sponsorship_EMA_8_lag2', 0.002),
 ('Sponsorship_EMA_8_lag3', -0.0),
 ('Sponsorship_Ad_Stock', 0.005),
 ('Sponsorship_Ad_Stock_lag1', 0.0),
 ('Sponsorship_Ad_Stock_lag2', 0.004),
 ('Sponsorship_Ad_Stock_lag3', 0.002),
 ('Content Marketing', -0.021),
 ('Content Marketing_lag1', 0.013),
 ('Content Marketing_lag2', -0.004),
 ('Content Marketing_lag3', 0.01),
 ('Content Marketing_SMA_3', -0.004),
 ('Content Marketing_SMA_3_lag1', 0.007),
 ('Content Marketing_SMA_3_lag2', -0.002),
 ('Content Marketing_SMA_3_lag3', 0.0),
 ('Content Marketing_SMA_5', -0.003),
 ('Content Marketing_SMA_5_lag1', 0.003),
 ('Content Marketing_SMA_5_lag2', -0.002),
 ('Content Marketing_SMA_5_lag3', -0.004),
 ('Content Marketing_EMA_8', -0.005),
 ('Content Marketing_EMA_8_lag1', 0.003),
 ('Content Marketing_EMA_8_lag2', -0.002),
 ('Content Marketing_EMA_8_lag3', -0.001),
 ('Content_Marketing_Ad_Stock', -0.007),
 ('Content_Marketing_Ad_Stock_lag1', 0.006),
 ('Content_Marketing_Ad_Stock_lag2', -0.002),
 ('Content_Marketing_Ad_Stock_lag3', 0.001),
 ('Online marketing', 0.005),
 ('Online marketing_lag1', 0.007),
 ('Online marketing_lag2', 0.004),
 ('Online marketing_lag3', 0.012),
 ('Online marketing_SMA_3', 0.005),
 ('Online marketing_SMA_3_lag1', 0.007),
 ('Online marketing_SMA_3_lag2', 0.0),
 ('Online marketing_SMA_3_lag3', 0.002),
 ('Online marketing_SMA_5', 0.002),
 ('Online marketing_SMA_5_lag1', 0.003),
 ('Online marketing_SMA_5_lag2', 0.003),
 ('Online marketing_SMA_5_lag3', 0.003),
 ('Online marketing_EMA_8', 0.005),
 ('Online marketing_EMA_8_lag1', 0.005),
 ('Online marketing_EMA_8_lag2', 0.004),
 ('Online marketing_EMA_8_lag3', 0.004),
 ('Online_marketing_Ad_Stock', 0.005),
 ('Online_marketing_Ad_Stock_lag1', 0.005),
 ('Online_marketing_Ad_Stock_lag2', 0.004),
 ('Online_marketing_Ad_Stock_lag3', 0.004),
 ('Affiliates', 0.012),
 ('Affiliates_lag1', 0.007),
 ('Affiliates_lag2', 0.004),
 ('Affiliates_lag3', 0.004),
 ('Affiliates_SMA_3', 0.008),
 ('Affiliates_SMA_3_lag1', 0.005),
 ('Affiliates_SMA_3_lag2', -0.002),
 ('Affiliates_SMA_3_lag3', -0.0),
 ('Affiliates_SMA_5', 0.002),
 ('Affiliates_SMA_5_lag1', 0.002),
 ('Affiliates_SMA_5_lag2', 0.003),
 ('Affiliates_SMA_5_lag3', 0.002),
 ('Affiliates_EMA_8', 0.006),
 ('Affiliates_EMA_8_lag1', 0.004),
 ('Affiliates_EMA_8_lag2', 0.003),
 ('Affiliates_EMA_8_lag3', 0.003),
 ('Affiliates_Ad_Stock', 0.008),
 ('Affiliates_Ad_Stock_lag1', 0.004),
 ('Affiliates_Ad_Stock_lag2', 0.002),
 ('Affiliates_Ad_Stock_lag3', 0.001),
 ('SEM', -0.013),
 ('SEM_lag1', 0.013),
 ('SEM_lag2', -0.003),
 ('SEM_lag3', 0.008),
 ('SEM_SMA_3', -0.0),
 ('SEM_SMA_3_lag1', 0.007),
 ('SEM_SMA_3_lag2', -0.007),
 ('SEM_SMA_3_lag3', -0.006),
 ('SEM_SMA_5', -0.006),
 ('SEM_SMA_5_lag1', -0.003),
 ('SEM_SMA_5_lag2', -0.002),
 ('SEM_SMA_5_lag3', -0.006),
 ('SEM_EMA_8', -0.004),
 ('SEM_EMA_8_lag1', 0.003),
 ('SEM_EMA_8_lag2', -0.003),
 ('SEM_EMA_8_lag3', -0.002),
 ('SEM_Ad_Stock', -0.004),
 ('SEM_Ad_Stock_lag1', 0.005),
 ('SEM_Ad_Stock_lag2', -0.003),
 ('SEM_Ad_Stock_lag3', -0.002),
 ('Radio', -0.008),
 ('Radio_lag1', 0.007),
 ('Radio_lag2', -0.01),
 ('Radio_lag3', 0.015),
 ('Radio_SMA_3', -0.004),
 ('Radio_SMA_3_lag1', 0.005),
 ('Radio_SMA_3_lag2', 0.001),
 ('Radio_SMA_3_lag3', -0.005),
 ('Radio_SMA_5', -0.002),
 ('Radio_SMA_5_lag1', -0.005),
 ('Radio_SMA_5_lag2', 0.0),
 ('Radio_SMA_5_lag3', 0.008),
 ('Radio_EMA_8', -0.002),
 ('Radio_EMA_8_lag1', 0.002),
 ('Radio_EMA_8_lag2', 0.001),
 ('Radio_EMA_8_lag3', 0.005),
 ('Radio_Ad_Stock', -0.003),
 ('Radio_Ad_Stock_lag1', 0.002),
 ('Radio_Ad_Stock_lag2', -0.002),
 ('Radio_Ad_Stock_lag3', 0.005),
 ('Other', -0.001),
 ('Other_lag1', 0.012),
 ('Other_lag2', -0.012),
 ('Other_lag3', -0.001),
 ('Other_SMA_3', -0.0),
 ('Other_SMA_3_lag1', -0.0),
 ('Other_SMA_3_lag2', 0.002),
 ('Other_SMA_3_lag3', -0.005),
 ('Other_SMA_5', 0.001),
 ('Other_SMA_5_lag1', -0.004),
 ('Other_SMA_5_lag2', -0.0),
 ('Other_SMA_5_lag3', 0.006),
 ('Other_EMA_8', 0.001),
 ('Other_EMA_8_lag1', 0.002),
 ('Other_EMA_8_lag2', -0.001),
 ('Other_EMA_8_lag3', 0.003),
 ('Other_Ad_Stock', 0.001),
 ('Other_Ad_Stock_lag1', 0.003),
 ('Other_Ad_Stock_lag2', -0.004),
 ('Other_Ad_Stock_lag3', 0.002),
 ('NPS', -0.002),
 ('NPS_lag1', 0.006),
 ('NPS_lag2', 0.006),
 ('NPS_lag3', 0.004),
 ('NPS_SMA_3', 0.008),
 ('NPS_SMA_3_lag1', 0.004),
 ('NPS_SMA_3_lag2', -0.018),
 ('NPS_SMA_3_lag3', -0.013),
 ('NPS_SMA_5', -0.018),
 ('NPS_SMA_5_lag1', -0.014),
 ('NPS_SMA_5_lag2', 0.006),
 ('NPS_SMA_5_lag3', 0.004),
 ('Stock Index', -0.012),
 ('Stock Index_lag1', 0.009),
 ('Stock Index_lag2', 0.01),
 ('Stock Index_lag3', 0.01),
 ('Stock Index_SMA_3', 0.008),
 ('Stock Index_SMA_3_lag1', 0.008),
 ('Stock Index_SMA_3_lag2', -0.018),
 ('Stock Index_SMA_3_lag3', -0.012),
 ('Stock Index_SMA_5', -0.018),
 ('Stock Index_SMA_5_lag1', -0.012),
 ('Stock Index_SMA_5_lag2', 0.006),
 ('Stock Index_SMA_5_lag3', 0.004),
 ('Max Temp', 0.007),
 ('Max Temp_lag1', -0.008),
 ('Max Temp_lag2', 0.023),
 ('Max Temp_lag3', -0.014),
 ('Min Temp', -0.006),
 ('Min Temp_lag1', 0.023),
 ('Min Temp_lag2', -0.009),
 ('Min Temp_lag3', -0.017),
 ('Mean Temp', 0.005),
 ('Mean Temp_lag1', 0.007),
 ('Mean Temp_lag2', 0.002),
 ('Mean Temp_lag3', -0.005),
 ('Heat Deg Days', 0.009),
 ('Heat Deg Days_lag1', -0.007),
 ('Heat Deg Days_lag2', -0.003),
 ('Heat Deg Days_lag3', 0.01),
 ('Cool Deg Days', 0.047),
 ('Cool Deg Days_lag1', -0.004),
 ('Cool Deg Days_lag2', -0.016),
 ('Cool Deg Days_lag3', -0.0),
 ('Total Rain (mm)', -0.005),
 ('Total Rain (mm)_lag1', 0.014),
 ('Total Rain (mm)_lag2', 0.015),
 ('Total Rain (mm)_lag3', 0.002),
 ('Total Snow (cm)', 0.016),
 ('Total Snow (cm)_lag1', 0.032),
 ('Total Snow (cm)_lag2', 0.003),
 ('Total Snow (cm)_lag3', -0.036),
 ('Total Precip (mm)', -0.001),
 ('Total Precip (mm)_lag1', 0.02),
 ('Total Precip (mm)_lag2', 0.015),
 ('Total Precip (mm)_lag3', -0.005),
 ('Snow on Grnd (cm)', 0.019),
 ('Snow on Grnd (cm)_lag1', 0.009),
 ('Snow on Grnd (cm)_lag2', -0.004),
 ('Snow on Grnd (cm)_lag3', -0.005),
 ('Sale', 0.019),
 ('Sale_lag1', -0.012),
 ('Sale_lag2', -0.01),
 ('Sale_lag3', 0.002)]
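
The name/coefficient pairing above can also be built in one step as a pandas Series indexed by feature name, taking the names from the same frame the model was fitted on. A minimal sketch on a toy frame; `X_toy`, `y_toy`, `toy_model`, `toy_coef` and the three column names are hypothetical stand-ins for the notebook's objects:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data standing in for the feature matrix; column names are hypothetical.
rng = np.random.RandomState(1)
X_toy = pd.DataFrame(rng.normal(size=(50, 3)), columns=['TV', 'Digital', 'Sale'])
y_toy = 2.0 * X_toy['TV'] - 1.0 * X_toy['Digital'] + rng.normal(scale=0.05, size=50)

toy_model = LinearRegression().fit(X_toy, y_toy)

# Index the coefficients by the columns of the frame used for fitting,
# so that names and values cannot drift out of alignment.
toy_coef = pd.Series(toy_model.coef_, index=X_toy.columns).round(3)

# Drop coefficients that round to zero and sort from largest to smallest.
toy_coef = toy_coef[toy_coef != 0.0].sort_values(ascending=False)
print(toy_coef)
```

The intercept is kept separate here (`toy_model.intercept_`), which avoids inserting a "constant" row and then dropping it again.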
In [363]:
gamingaccessory_lr_coef_df = pd.DataFrame(gamingaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.rename(columns=col_rename)
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.iloc[1:,:]
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.loc[gamingaccessory_lr_coef_df['Coefficients']!=0.0]
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
gamingaccessory_lr_coef_df
Out[363]:
Features Coefficients
36 product_vertical_gamepad 0.150
40 product_vertical_gamingaccessorykit 0.115
28 is_mass_market 0.109
80 product_vertical_motioncontroller 0.100
56 product_vertical_gamingkeyboard 0.089
84 product_vertical_tvoutcableaccessory 0.087
52 product_vertical_gamingheadset 0.086
72 product_vertical_gamingspeaker 0.077
64 product_vertical_gamingmouse 0.077
68 product_vertical_gamingmousepad 0.076
76 product_vertical_joystickgamingwheel 0.071
24 is_cod 0.060
44 product_vertical_gamingadapter 0.054
336 Cool Deg Days 0.047
345 Total Snow (cm)_lag1 0.032
116 TV 0.027
8 deliverybdays 0.025
12 deliverycdays 0.025
322 Max Temp_lag2 0.023
325 Min Temp_lag1 0.023
74 product_vertical_gamingspeaker_lag2 0.022
6 Discount%_lag2 0.021
15 deliverycdays_lag3 0.021
91 payday_week_lag3 0.020
90 payday_week_lag2 0.020
349 Total Precip (mm)_lag1 0.020
11 deliverybdays_lag3 0.020
352 Snow on Grnd (cm) 0.019
356 Sale 0.019
20 product_procurement_sla 0.017
137 Digital_lag1 0.017
46 product_vertical_gamingadapter_lag2 0.017
93 holiday_week_lag1 0.016
95 holiday_week_lag3 0.016
43 product_vertical_gamingaccessorykit_lag3 0.016
16 sla 0.016
344 Total Snow (cm) 0.016
60 product_vertical_gamingmemorycard 0.015
350 Total Precip (mm)_lag2 0.015
342 Total Rain (mm)_lag2 0.015
259 Radio_lag3 0.015
132 TV_Ad_Stock 0.014
23 product_procurement_sla_lag3 0.014
81 product_vertical_motioncontroller_lag1 0.014
341 Total Rain (mm)_lag1 0.014
237 SEM_lag1 0.013
59 product_vertical_gamingkeyboard_lag3 0.013
177 Content Marketing_lag1 0.013
216 Affiliates 0.012
199 Online marketing_lag3 0.012
58 product_vertical_gamingkeyboard_lag2 0.012
277 Other_lag1 0.012
120 TV_SMA_3 0.011
335 Heat Deg Days_lag3 0.010
22 product_procurement_sla_lag2 0.010
33 product_vertical_gamecontrolmount_lag1 0.010
179 Content Marketing_lag3 0.010
63 product_vertical_gamingmemorycard_lag3 0.010
311 Stock Index_lag3 0.010
310 Stock Index_lag2 0.010
156 Sponsorship 0.010
159 Sponsorship_lag3 0.009
128 TV_EMA_8 0.009
62 product_vertical_gamingmemorycard_lag2 0.009
309 Stock Index_lag1 0.009
332 Heat Deg Days 0.009
124 TV_SMA_5 0.009
353 Snow on Grnd (cm)_lag1 0.009
71 product_vertical_gamingmousepad_lag3 0.009
86 product_vertical_tvoutcableaccessory_lag2 0.008
220 Affiliates_SMA_3 0.008
267 Radio_SMA_5_lag3 0.008
117 TV_lag1 0.008
300 NPS_SMA_3 0.008
232 Affiliates_Ad_Stock 0.008
312 Stock Index_SMA_3 0.008
313 Stock Index_SMA_3_lag1 0.008
99 Total Investment_lag3 0.008
239 SEM_lag3 0.008
201 Online marketing_SMA_3_lag1 0.007
158 Sponsorship_lag2 0.007
257 Radio_lag1 0.007
241 SEM_SMA_3_lag1 0.007
181 Content Marketing_SMA_3_lag1 0.007
78 product_vertical_joystickgamingwheel_lag2 0.007
48 product_vertical_gamingchargingstation 0.007
4 Discount% 0.007
197 Online marketing_lag1 0.007
329 Mean Temp_lag1 0.007
217 Affiliates_lag1 0.007
320 Max Temp 0.007
193 Content_Marketing_Ad_Stock_lag1 0.006
153 Digital_Ad_Stock_lag1 0.006
287 Other_SMA_5_lag3 0.006
318 Stock Index_SMA_5_lag2 0.006
228 Affiliates_EMA_8 0.006
306 NPS_SMA_5_lag2 0.006
65 product_vertical_gamingmouse_lag1 0.006
298 NPS_lag2 0.006
297 NPS_lag1 0.006
213 Online_marketing_Ad_Stock_lag1 0.005
160 Sponsorship_SMA_3 0.005
212 Online_marketing_Ad_Stock 0.005
209 Online marketing_EMA_8_lag1 0.005
221 Affiliates_SMA_3_lag1 0.005
141 Digital_SMA_3_lag1 0.005
208 Online marketing_EMA_8 0.005
196 Online marketing 0.005
261 Radio_SMA_3_lag1 0.005
275 Radio_Ad_Stock_lag3 0.005
328 Mean Temp 0.005
200 Online marketing_SMA_3 0.005
271 Radio_EMA_8_lag3 0.005
96 Total Investment 0.005
172 Sponsorship_Ad_Stock 0.005
101 Total Investment_SMA_3_lag1 0.005
253 SEM_Ad_Stock_lag1 0.005
18 sla_lag2 0.005
301 NPS_SMA_3_lag1 0.004
211 Online marketing_EMA_8_lag3 0.004
218 Affiliates_lag2 0.004
307 NPS_SMA_5_lag3 0.004
215 Online_marketing_Ad_Stock_lag3 0.004
198 Online marketing_lag2 0.004
214 Online_marketing_Ad_Stock_lag2 0.004
219 Affiliates_lag3 0.004
299 NPS_lag3 0.004
174 Sponsorship_Ad_Stock_lag2 0.004
112 Total_Investment_Ad_Stock 0.004
210 Online marketing_EMA_8_lag2 0.004
97 Total Investment_lag1 0.004
229 Affiliates_EMA_8_lag1 0.004
100 Total Investment_SMA_3 0.004
233 Affiliates_Ad_Stock_lag1 0.004
164 Sponsorship_SMA_5 0.004
161 Sponsorship_SMA_3_lag1 0.004
319 Stock Index_SMA_5_lag3 0.004
162 Sponsorship_SMA_3_lag2 0.003
207 Online marketing_SMA_5_lag3 0.003
205 Online marketing_SMA_5_lag1 0.003
168 Sponsorship_EMA_8 0.003
163 Sponsorship_SMA_3_lag3 0.003
206 Online marketing_SMA_5_lag2 0.003
189 Content Marketing_EMA_8_lag1 0.003
149 Digital_EMA_8_lag1 0.003
113 Total_Investment_Ad_Stock_lag1 0.003
85 product_vertical_tvoutcableaccessory_lag1 0.003
185 Content Marketing_SMA_5_lag1 0.003
293 Other_Ad_Stock_lag1 0.003
108 Total Investment_EMA_8 0.003
249 SEM_EMA_8_lag1 0.003
291 Other_EMA_8_lag3 0.003
231 Affiliates_EMA_8_lag3 0.003
230 Affiliates_EMA_8_lag2 0.003
27 is_cod_lag3 0.003
226 Affiliates_SMA_5_lag2 0.003
346 Total Snow (cm)_lag2 0.003
289 Other_EMA_8_lag1 0.002
295 Other_Ad_Stock_lag3 0.002
343 Total Rain (mm)_lag3 0.002
330 Mean Temp_lag2 0.002
269 Radio_EMA_8_lag1 0.002
282 Other_SMA_3_lag2 0.002
224 Affiliates_SMA_5 0.002
203 Online marketing_SMA_3_lag3 0.002
234 Affiliates_Ad_Stock_lag2 0.002
227 Affiliates_SMA_5_lag3 0.002
204 Online marketing_SMA_5 0.002
225 Affiliates_SMA_5_lag1 0.002
273 Radio_Ad_Stock_lag1 0.002
359 Sale_lag3 0.002
165 Sponsorship_SMA_5_lag1 0.002
175 Sponsorship_Ad_Stock_lag3 0.002
133 TV_Ad_Stock_lag1 0.002
104 Total Investment_SMA_5 0.002
98 Total Investment_lag2 0.002
129 TV_EMA_8_lag1 0.002
166 Sponsorship_SMA_5_lag2 0.002
170 Sponsorship_EMA_8_lag2 0.002
106 Total Investment_SMA_5_lag2 0.002
109 Total Investment_EMA_8_lag1 0.002
270 Radio_EMA_8_lag2 0.001
25 is_cod_lag1 0.001
195 Content_Marketing_Ad_Stock_lag3 0.001
122 TV_SMA_3_lag2 0.001
235 Affiliates_Ad_Stock_lag3 0.001
110 Total Investment_EMA_8_lag2 0.001
127 TV_SMA_5_lag3 0.001
115 Total_Investment_Ad_Stock_lag3 0.001
111 Total Investment_EMA_8_lag3 0.001
292 Other_Ad_Stock 0.001
262 Radio_SMA_3_lag2 0.001
288 Other_EMA_8 0.001
284 Other_SMA_5 0.001
114 Total_Investment_Ad_Stock_lag2 0.001
107 Total Investment_SMA_5_lag3 -0.001
130 TV_EMA_8_lag2 -0.001
348 Total Precip (mm) -0.001
139 Digital_lag3 -0.001
279 Other_lag3 -0.001
276 Other -0.001
290 Other_EMA_8_lag2 -0.001
125 TV_SMA_5_lag1 -0.001
191 Content Marketing_EMA_8_lag3 -0.001
255 SEM_Ad_Stock_lag3 -0.002
222 Affiliates_SMA_3_lag2 -0.002
123 TV_SMA_3_lag3 -0.002
246 SEM_SMA_5_lag2 -0.002
274 Radio_Ad_Stock_lag2 -0.002
135 TV_Ad_Stock_lag3 -0.002
264 Radio_SMA_5 -0.002
268 Radio_EMA_8 -0.002
182 Content Marketing_SMA_3_lag2 -0.002
167 Sponsorship_SMA_5_lag3 -0.002
296 NPS -0.002
47 product_vertical_gamingadapter_lag3 -0.002
194 Content_Marketing_Ad_Stock_lag2 -0.002
39 product_vertical_gamepad_lag3 -0.002
190 Content Marketing_EMA_8_lag2 -0.002
186 Content Marketing_SMA_5_lag2 -0.002
251 SEM_EMA_8_lag3 -0.002
250 SEM_EMA_8_lag2 -0.003
184 Content Marketing_SMA_5 -0.003
254 SEM_Ad_Stock_lag2 -0.003
245 SEM_SMA_5_lag1 -0.003
334 Heat Deg Days_lag2 -0.003
10 deliverybdays_lag2 -0.003
89 payday_week_lag1 -0.003
272 Radio_Ad_Stock -0.003
238 SEM_lag2 -0.003
146 Digital_SMA_5_lag2 -0.003
145 Digital_SMA_5_lag1 -0.003
134 TV_Ad_Stock_lag2 -0.003
260 Radio_SMA_3 -0.004
138 Digital_lag2 -0.004
9 deliverybdays_lag1 -0.004
337 Cool Deg Days_lag1 -0.004
354 Snow on Grnd (cm)_lag2 -0.004
26 is_cod_lag2 -0.004
151 Digital_EMA_8_lag3 -0.004
79 product_vertical_joystickgamingwheel_lag3 -0.004
180 Content Marketing_SMA_3 -0.004
152 Digital_Ad_Stock -0.004
285 Other_SMA_5_lag1 -0.004
67 product_vertical_gamingmouse_lag3 -0.004
187 Content Marketing_SMA_5_lag3 -0.004
53 product_vertical_gamingheadset_lag1 -0.004
14 deliverycdays_lag2 -0.004
121 TV_SMA_3_lag1 -0.004
49 product_vertical_gamingchargingstation_lag1 -0.004
118 TV_lag2 -0.004
252 SEM_Ad_Stock -0.004
178 Content Marketing_lag2 -0.004
294 Other_Ad_Stock_lag2 -0.004
248 SEM_EMA_8 -0.004
150 Digital_EMA_8_lag2 -0.005
155 Digital_Ad_Stock_lag3 -0.005
351 Total Precip (mm)_lag3 -0.005
283 Other_SMA_3_lag3 -0.005
355 Snow on Grnd (cm)_lag3 -0.005
148 Digital_EMA_8 -0.005
340 Total Rain (mm) -0.005
13 deliverycdays_lag1 -0.005
265 Radio_SMA_5_lag1 -0.005
29 is_mass_market_lag1 -0.005
263 Radio_SMA_3_lag3 -0.005
331 Mean Temp_lag3 -0.005
188 Content Marketing_EMA_8 -0.005
243 SEM_SMA_3_lag3 -0.006
50 product_vertical_gamingchargingstation_lag2 -0.006
35 product_vertical_gamecontrolmount_lag3 -0.006
324 Min Temp -0.006
244 SEM_SMA_5 -0.006
157 Sponsorship_lag1 -0.006
247 SEM_SMA_5_lag3 -0.006
3 gmv_lag3 -0.006
154 Digital_Ad_Stock_lag2 -0.006
31 is_mass_market_lag3 -0.007
242 SEM_SMA_3_lag2 -0.007
144 Digital_SMA_5 -0.007
143 Digital_SMA_3_lag3 -0.007
192 Content_Marketing_Ad_Stock -0.007
333 Heat Deg Days_lag1 -0.007
75 product_vertical_gamingspeaker_lag3 -0.008
256 Radio -0.008
7 Discount%_lag3 -0.008
321 Max Temp_lag1 -0.008
147 Digital_SMA_5_lag3 -0.008
142 Digital_SMA_3_lag2 -0.009
51 product_vertical_gamingchargingstation_lag3 -0.009
5 Discount%_lag1 -0.009
326 Min Temp_lag2 -0.009
258 Radio_lag2 -0.010
358 Sale_lag2 -0.010
21 product_procurement_sla_lag1 -0.010
45 product_vertical_gamingadapter_lag1 -0.011
17 sla_lag1 -0.011
357 Sale_lag1 -0.012
37 product_vertical_gamepad_lag1 -0.012
38 product_vertical_gamepad_lag2 -0.012
1 gmv_lag1 -0.012
69 product_vertical_gamingmousepad_lag1 -0.012
278 Other_lag2 -0.012
317 Stock Index_SMA_5_lag1 -0.012
315 Stock Index_SMA_3_lag3 -0.012
66 product_vertical_gamingmouse_lag2 -0.012
308 Stock Index -0.012
30 is_mass_market_lag2 -0.013
236 SEM -0.013
303 NPS_SMA_3_lag3 -0.013
61 product_vertical_gamingmemorycard_lag1 -0.014
55 product_vertical_gamingheadset_lag3 -0.014
305 NPS_SMA_5_lag1 -0.014
54 product_vertical_gamingheadset_lag2 -0.014
323 Max Temp_lag3 -0.014
136 Digital -0.015
119 TV_lag3 -0.015
19 sla_lag3 -0.015
41 product_vertical_gamingaccessorykit_lag1 -0.015
338 Cool Deg Days_lag2 -0.016
82 product_vertical_motioncontroller_lag2 -0.016
73 product_vertical_gamingspeaker_lag1 -0.017
83 product_vertical_motioncontroller_lag3 -0.017
327 Min Temp_lag3 -0.017
87 product_vertical_tvoutcableaccessory_lag3 -0.018
304 NPS_SMA_5 -0.018
57 product_vertical_gamingkeyboard_lag1 -0.018
302 NPS_SMA_3_lag2 -0.018
34 product_vertical_gamecontrolmount_lag2 -0.018
314 Stock Index_SMA_3_lag2 -0.018
316 Stock Index_SMA_5 -0.018
2 gmv_lag2 -0.019
94 holiday_week_lag2 -0.020
176 Content Marketing -0.021
77 product_vertical_joystickgamingwheel_lag1 -0.023
70 product_vertical_gamingmousepad_lag2 -0.025
42 product_vertical_gamingaccessorykit_lag2 -0.026
347 Total Snow (cm)_lag3 -0.036
88 payday_week -0.038

Plotting the Features in descending order of Importance for gamingaccessory

In [364]:
# Use a tall figure so that each of the several hundred feature labels gets its own row.
plt.figure(figsize=(10, 35), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=gamingaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplot fits into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 most important features affecting GMV(Revenue) for gamingaccessory are:
Features Coefficients
product_vertical_gamepad 0.150
product_vertical_gamingaccessorykit 0.115
is_mass_market 0.109
product_vertical_motioncontroller 0.100
product_vertical_gamingkeyboard 0.089
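
The list above ranks features by their signed coefficient, so a strongly negative driver can never appear in it. The sketch below shows one way to rank by magnitude instead. The helper name `top_features` and the frame `toy_coef_df` are hypothetical, the five rows are copied from the gamingaccessory coefficient table above, and ranking by magnitude is only meaningful under the assumption that all features were scaled to comparable units before fitting.

```python
import pandas as pd


def top_features(coef_df, n=5, by_magnitude=True):
    """Return the n rows of coef_df with the largest coefficients.

    by_magnitude=True ranks by absolute value, so strongly negative
    drivers are included; by_magnitude=False ranks by signed value,
    which is how the list above was produced.
    """
    key = coef_df['Coefficients'].abs() if by_magnitude else coef_df['Coefficients']
    order = key.sort_values(ascending=False).index
    return coef_df.loc[order].head(n)


# Toy frame: a handful of rows taken from the gamingaccessory table above.
toy_coef_df = pd.DataFrame({
    'Features': ['product_vertical_gamepad', 'is_mass_market', 'payday_week',
                 'Total Snow (cm)_lag3', 'NPS'],
    'Coefficients': [0.150, 0.109, -0.038, -0.036, -0.002],
})

print(top_features(toy_coef_df, n=3))                      # ranked by |coefficient|
print(top_features(toy_coef_df, n=3, by_magnitude=False))  # ranked by signed value
```

Applied to the full `gamingaccessory_lr_coef_df`, the same call would show whether any negative coefficient is large enough to displace one of the five features listed above.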

Building Linear Regression model for homeaudio

In [365]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

homeaudio_dladd_model = LinearRegression().fit(X_homeaudio_dladd_train, y_homeaudio_dladd_train)
y_homeaudio_dladd_test_pred = homeaudio_dladd_model.predict(X_homeaudio_dladd_test)

print('R2 Score: {}'.format(r2_score(y_homeaudio_dladd_test, y_homeaudio_dladd_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_dladd_test, y_homeaudio_dladd_test_pred)))
R2 Score: 0.42170843539767855
Mean Squared Error: 1.3866075828708573
With simple linear regression on the train/test split, we get an R2 score of 0.42 and an MSE of 1.39.

Building Linear Regression model for homeaudio using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross-validation for our linear regression.

In [366]:
y_homeaudio_dladd = homeaudio_dladd_df.pop('gmv')
X_homeaudio_dladd = homeaudio_dladd_df
In [367]:
# Make cross validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

homeaudio_dladd_model_cv = LinearRegression().fit(X_homeaudio_dladd, y_homeaudio_dladd)
homeaudio_dladd_predictions_cv = cross_val_predict(homeaudio_dladd_model_cv, X_homeaudio_dladd, y_homeaudio_dladd, cv=10)
accuracy = metrics.r2_score(y_homeaudio_dladd, homeaudio_dladd_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_dladd, homeaudio_dladd_predictions_cv)))
Cross-Predicted Accuracy: 0.553216298571438
Mean Squared Error: 0.4467837014285619
With simple linear regression, using 10-fold cross-validation, we get an R2 score of 0.55 and an MSE of 0.45.
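
The two cross-validated numbers above sum to 1. That is what one would expect if the target was standardised to zero mean and unit variance, because the pooled R2 then equals 1 minus the MSE. The following is a minimal sketch of that relationship on synthetic data; `X_demo`, `y_demo` and the coefficients used to generate them are made up for illustration and are not the notebook's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic regression problem (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_raw = X_demo @ np.array([1.5, -2.0, 0.5, 0.0, 0.0]) + rng.normal(size=200)

# Standardise the target to zero mean and unit (population) variance.
y_demo = (y_raw - y_raw.mean()) / y_raw.std()

# cv=10 on a regressor corresponds to an unshuffled 10-fold KFold split.
cv = KFold(n_splits=10, shuffle=False)

# cross_val_predict clones the estimator and refits it on each training fold,
# so every prediction comes from a model that never saw that row.
pred = cross_val_predict(LinearRegression(), X_demo, y_demo, cv=cv)

r2 = r2_score(y_demo, pred)
mse = mean_squared_error(y_demo, pred)
print('Cross-validated R2 :', round(r2, 4))
print('Cross-validated MSE:', round(mse, 4))
print('R2 + MSE           :', round(r2 + mse, 4))
```

Because the weekly rows include lagged features, an unshuffled split that respects time order (for example scikit-learn's `TimeSeriesSplit`) may be worth comparing against plain `KFold`; that is a suggestion, not something the notebook does.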

Determining Feature Importance for homeaudio

In [368]:
# Linear regression model parameters
# Limit float output to 3 decimal places
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


homeaudio_lr_model_parameters = list(homeaudio_dladd_model_cv.coef_)
homeaudio_lr_model_parameters.insert(0, homeaudio_dladd_model_cv.intercept_)
homeaudio_lr_model_parameters = [round(x, 3) for x in homeaudio_lr_model_parameters]
cols = homeaudio_dladd_test.columns
cols = cols.insert(0, "constant")
homeaudio_lr_coef = list(zip(cols, homeaudio_lr_model_parameters))
homeaudio_lr_coef
Out[368]:
[('constant', -0.0),
 ('gmv_lag1', -0.031),
 ('gmv_lag2', -0.015),
 ('gmv_lag3', -0.002),
 ('Discount%', 0.073),
 ('Discount%_lag1', -0.007),
 ('Discount%_lag2', -0.004),
 ('Discount%_lag3', 0.027),
 ('deliverybdays', 0.002),
 ('deliverybdays_lag1', -0.005),
 ('deliverybdays_lag2', 0.002),
 ('deliverybdays_lag3', -0.01),
 ('deliverycdays', 0.002),
 ('deliverycdays_lag1', -0.004),
 ('deliverycdays_lag2', 0.002),
 ('deliverycdays_lag3', -0.01),
 ('sla', -0.018),
 ('sla_lag1', -0.003),
 ('sla_lag2', 0.025),
 ('sla_lag3', -0.02),
 ('product_procurement_sla', -0.04),
 ('product_procurement_sla_lag1', 0.022),
 ('product_procurement_sla_lag2', -0.003),
 ('product_procurement_sla_lag3', 0.023),
 ('is_cod', 0.145),
 ('is_cod_lag1', -0.017),
 ('is_cod_lag2', -0.002),
 ('is_cod_lag3', 0.004),
 ('is_mass_market', 0.162),
 ('is_mass_market_lag1', -0.02),
 ('is_mass_market_lag2', -0.011),
 ('is_mass_market_lag3', -0.005),
 ('product_vertical_djcontroller', 0.027),
 ('product_vertical_djcontroller_lag1', 0.032),
 ('product_vertical_djcontroller_lag2', -0.036),
 ('product_vertical_djcontroller_lag3', -0.035),
 ('product_vertical_dock', 0.067),
 ('product_vertical_dock_lag1', 0.015),
 ('product_vertical_dock_lag2', -0.012),
 ('product_vertical_dock_lag3', -0.008),
 ('product_vertical_dockingstation', 0.02),
 ('product_vertical_dockingstation_lag1', 0.001),
 ('product_vertical_dockingstation_lag2', -0.033),
 ('product_vertical_dockingstation_lag3', 0.035),
 ('product_vertical_fmradio', 0.094),
 ('product_vertical_fmradio_lag1', 0.007),
 ('product_vertical_fmradio_lag2', -0.007),
 ('product_vertical_fmradio_lag3', 0.002),
 ('product_vertical_hifisystem', 0.066),
 ('product_vertical_hifisystem_lag1', -0.032),
 ('product_vertical_hifisystem_lag2', 0.017),
 ('product_vertical_hifisystem_lag3', -0.033),
 ('product_vertical_homeaudiospeaker', 0.184),
 ('product_vertical_homeaudiospeaker_lag1', -0.033),
 ('product_vertical_homeaudiospeaker_lag2', -0.011),
 ('product_vertical_homeaudiospeaker_lag3', -0.001),
 ('product_vertical_karaokeplayer', 0.173),
 ('product_vertical_karaokeplayer_lag1', -0.019),
 ('product_vertical_karaokeplayer_lag2', -0.027),
 ('product_vertical_karaokeplayer_lag3', -0.03),
 ('product_vertical_slingbox', -0.015),
 ('product_vertical_slingbox_lag1', 0.011),
 ('product_vertical_slingbox_lag2', -0.003),
 ('product_vertical_slingbox_lag3', -0.004),
 ('product_vertical_soundmixer', 0.022),
 ('product_vertical_soundmixer_lag1', 0.009),
 ('product_vertical_soundmixer_lag2', -0.017),
 ('product_vertical_soundmixer_lag3', -0.004),
 ('product_vertical_voicerecorder', 0.032),
 ('product_vertical_voicerecorder_lag1', 0.032),
 ('product_vertical_voicerecorder_lag2', -0.004),
 ('product_vertical_voicerecorder_lag3', -0.011),
 ('payday_week', -0.015),
 ('payday_week_lag1', -0.006),
 ('payday_week_lag2', -0.003),
 ('payday_week_lag3', 0.006),
 ('holiday_week', -0.024),
 ('holiday_week_lag1', 0.004),
 ('holiday_week_lag2', -0.006),
 ('holiday_week_lag3', -0.035),
 ('Total Investment', -0.001),
 ('Total Investment_lag1', 0.004),
 ('Total Investment_lag2', 0.0),
 ('Total Investment_lag3', -0.0),
 ('Total Investment_SMA_3', 0.002),
 ('Total Investment_SMA_3_lag1', 0.002),
 ('Total Investment_SMA_3_lag2', -0.001),
 ('Total Investment_SMA_3_lag3', 0.0),
 ('Total Investment_SMA_5', 0.0),
 ('Total Investment_SMA_5_lag1', 0.001),
 ('Total Investment_SMA_5_lag2', 0.003),
 ('Total Investment_SMA_5_lag3', -0.004),
 ('Total Investment_EMA_8', 0.004),
 ('Total Investment_EMA_8_lag1', 0.005),
 ('Total Investment_EMA_8_lag2', 0.005),
 ('Total Investment_EMA_8_lag3', 0.006),
 ('Total_Investment_Ad_Stock', 0.002),
 ('Total_Investment_Ad_Stock_lag1', 0.003),
 ('Total_Investment_Ad_Stock_lag2', 0.001),
 ('Total_Investment_Ad_Stock_lag3', 0.002),
 ('TV', 0.001),
 ('TV_lag1', 0.008),
 ('TV_lag2', -0.024),
 ('TV_lag3', 0.009),
 ('TV_SMA_3', -0.005),
 ('TV_SMA_3_lag1', -0.002),
 ('TV_SMA_3_lag2', -0.007),
 ('TV_SMA_3_lag3', -0.001),
 ('TV_SMA_5', -0.003),
 ('TV_SMA_5_lag1', -0.005),
 ('TV_SMA_5_lag2', -0.009),
 ('TV_SMA_5_lag3', -0.005),
 ('TV_EMA_8', -0.002),
 ('TV_EMA_8_lag1', -0.003),
 ('TV_EMA_8_lag2', -0.006),
 ('TV_EMA_8_lag3', 0.001),
 ('TV_Ad_Stock', -0.001),
 ('TV_Ad_Stock_lag1', -0.003),
 ('TV_Ad_Stock_lag2', -0.011),
 ('TV_Ad_Stock_lag3', 0.001),
 ('Digital', -0.015),
 ('Digital_lag1', 0.049),
 ('Digital_lag2', -0.009),
 ('Digital_lag3', -0.016),
 ('Digital_SMA_3', 0.011),
 ('Digital_SMA_3_lag1', 0.01),
 ('Digital_SMA_3_lag2', -0.008),
 ('Digital_SMA_3_lag3', -0.013),
 ('Digital_SMA_5', 0.003),
 ('Digital_SMA_5_lag1', 0.001),
 ('Digital_SMA_5_lag2', -0.005),
 ('Digital_SMA_5_lag3', -0.014),
 ('Digital_EMA_8', 0.008),
 ('Digital_EMA_8_lag1', 0.016),
 ('Digital_EMA_8_lag2', -0.001),
 ('Digital_EMA_8_lag3', 0.003),
 ('Digital_Ad_Stock', 0.005),
 ('Digital_Ad_Stock_lag1', 0.022),
 ('Digital_Ad_Stock_lag2', -0.009),
 ('Digital_Ad_Stock_lag3', -0.007),
 ('Sponsorship', 0.007),
 ('Sponsorship_lag1', -0.016),
 ('Sponsorship_lag2', 0.009),
 ('Sponsorship_lag3', -0.004),
 ('Sponsorship_SMA_3', 0.001),
 ('Sponsorship_SMA_3_lag1', -0.004),
 ('Sponsorship_SMA_3_lag2', 0.0),
 ('Sponsorship_SMA_3_lag3', 0.002),
 ('Sponsorship_SMA_5', -0.002),
 ('Sponsorship_SMA_5_lag1', -0.001),
 ('Sponsorship_SMA_5_lag2', 0.006),
 ('Sponsorship_SMA_5_lag3', -0.007),
 ('Sponsorship_EMA_8', 0.003),
 ('Sponsorship_EMA_8_lag1', -0.0),
 ('Sponsorship_EMA_8_lag2', 0.008),
 ('Sponsorship_EMA_8_lag3', 0.006),
 ('Sponsorship_Ad_Stock', 0.001),
 ('Sponsorship_Ad_Stock_lag1', -0.005),
 ('Sponsorship_Ad_Stock_lag2', 0.005),
 ('Sponsorship_Ad_Stock_lag3', 0.0),
 ('Content Marketing', -0.013),
 ('Content Marketing_lag1', 0.033),
 ('Content Marketing_lag2', 0.001),
 ('Content Marketing_lag3', -0.014),
 ('Content Marketing_SMA_3', 0.008),
 ('Content Marketing_SMA_3_lag1', 0.007),
 ('Content Marketing_SMA_3_lag2', -0.001),
 ('Content Marketing_SMA_3_lag3', -0.004),
 ('Content Marketing_SMA_5', 0.005),
 ('Content Marketing_SMA_5_lag1', 0.006),
 ('Content Marketing_SMA_5_lag2', 0.004),
 ('Content Marketing_SMA_5_lag3', -0.005),
 ('Content Marketing_EMA_8', 0.006),
 ('Content Marketing_EMA_8_lag1', 0.015),
 ('Content Marketing_EMA_8_lag2', 0.003),
 ('Content Marketing_EMA_8_lag3', 0.003),
 ('Content_Marketing_Ad_Stock', 0.003),
 ('Content_Marketing_Ad_Stock_lag1', 0.017),
 ('Content_Marketing_Ad_Stock_lag2', -0.001),
 ('Content_Marketing_Ad_Stock_lag3', -0.002),
 ('Online marketing', -0.005),
 ('Online marketing_lag1', -0.006),
 ('Online marketing_lag2', 0.009),
 ('Online marketing_lag3', 0.012),
 ('Online marketing_SMA_3', -0.0),
 ('Online marketing_SMA_3_lag1', 0.006),
 ('Online marketing_SMA_3_lag2', 0.005),
 ('Online marketing_SMA_3_lag3', 0.012),
 ('Online marketing_SMA_5', 0.001),
 ('Online marketing_SMA_5_lag1', 0.008),
 ('Online marketing_SMA_5_lag2', 0.009),
 ('Online marketing_SMA_5_lag3', 0.005),
 ('Online marketing_EMA_8', 0.003),
 ('Online marketing_EMA_8_lag1', 0.004),
 ('Online marketing_EMA_8_lag2', 0.007),
 ('Online marketing_EMA_8_lag3', 0.006),
 ('Online_marketing_Ad_Stock', -0.0),
 ('Online_marketing_Ad_Stock_lag1', 0.003),
 ('Online_marketing_Ad_Stock_lag2', 0.009),
 ('Online_marketing_Ad_Stock_lag3', 0.008),
 ('Affiliates', -0.005),
 ('Affiliates_lag1', -0.005),
 ('Affiliates_lag2', 0.006),
 ('Affiliates_lag3', 0.013),
 ('Affiliates_SMA_3', -0.001),
 ('Affiliates_SMA_3_lag1', 0.005),
 ('Affiliates_SMA_3_lag2', 0.003),
 ('Affiliates_SMA_3_lag3', 0.011),
 ('Affiliates_SMA_5', -0.0),
 ('Affiliates_SMA_5_lag1', 0.007),
 ('Affiliates_SMA_5_lag2', 0.006),
 ('Affiliates_SMA_5_lag3', 0.004),
 ('Affiliates_EMA_8', 0.002),
 ('Affiliates_EMA_8_lag1', 0.004),
 ('Affiliates_EMA_8_lag2', 0.006),
 ('Affiliates_EMA_8_lag3', 0.006),
 ('Affiliates_Ad_Stock', -0.0),
 ('Affiliates_Ad_Stock_lag1', 0.002),
 ('Affiliates_Ad_Stock_lag2', 0.007),
 ('Affiliates_Ad_Stock_lag3', 0.007),
 ('SEM', -0.012),
 ('SEM_lag1', 0.037),
 ('SEM_lag2', -0.003),
 ('SEM_lag3', -0.012),
 ('SEM_SMA_3', 0.009),
 ('SEM_SMA_3_lag1', 0.009),
 ('SEM_SMA_3_lag2', -0.004),
 ('SEM_SMA_3_lag3', -0.007),
 ('SEM_SMA_5', 0.003),
 ('SEM_SMA_5_lag1', 0.003),
 ('SEM_SMA_5_lag2', 0.0),
 ('SEM_SMA_5_lag3', -0.008),
 ('SEM_EMA_8', 0.009),
 ('SEM_EMA_8_lag1', 0.015),
 ('SEM_EMA_8_lag2', 0.004),
 ('SEM_EMA_8_lag3', 0.006),
 ('SEM_Ad_Stock', 0.005),
 ('SEM_Ad_Stock_lag1', 0.018),
 ('SEM_Ad_Stock_lag2', -0.003),
 ('SEM_Ad_Stock_lag3', -0.003),
 ('Radio', -0.003),
 ('Radio_lag1', 0.008),
 ('Radio_lag2', -0.012),
 ('Radio_lag3', 0.015),
 ('Radio_SMA_3', -0.003),
 ('Radio_SMA_3_lag1', 0.005),
 ('Radio_SMA_3_lag2', 0.005),
 ('Radio_SMA_3_lag3', -0.002),
 ('Radio_SMA_5', 0.005),
 ('Radio_SMA_5_lag1', -0.0),
 ('Radio_SMA_5_lag2', -0.001),
 ('Radio_SMA_5_lag3', 0.014),
 ('Radio_EMA_8', -0.0),
 ('Radio_EMA_8_lag1', 0.002),
 ('Radio_EMA_8_lag2', -0.003),
 ('Radio_EMA_8_lag3', 0.001),
 ('Radio_Ad_Stock', 0.0),
 ('Radio_Ad_Stock_lag1', 0.004),
 ('Radio_Ad_Stock_lag2', -0.002),
 ('Radio_Ad_Stock_lag3', 0.007),
 ('Other', -0.002),
 ('Other_lag1', 0.018),
 ('Other_lag2', -0.024),
 ('Other_lag3', 0.013),
 ('Other_SMA_3', -0.003),
 ('Other_SMA_3_lag1', 0.003),
 ('Other_SMA_3_lag2', -0.002),
 ('Other_SMA_3_lag3', -0.008),
 ('Other_SMA_5', 0.003),
 ('Other_SMA_5_lag1', -0.005),
 ('Other_SMA_5_lag2', -0.009),
 ('Other_SMA_5_lag3', 0.008),
 ('Other_EMA_8', -0.002),
 ('Other_EMA_8_lag1', -0.001),
 ('Other_EMA_8_lag2', -0.01),
 ('Other_EMA_8_lag3', -0.003),
 ('Other_Ad_Stock', 0.001),
 ('Other_Ad_Stock_lag1', 0.003),
 ('Other_Ad_Stock_lag2', -0.011),
 ('Other_Ad_Stock_lag3', 0.003),
 ('NPS', 0.009),
 ('NPS_lag1', -0.013),
 ('NPS_lag2', 0.001),
 ('NPS_lag3', -0.0),
 ('NPS_SMA_3', 0.004),
 ('NPS_SMA_3_lag1', 0.003),
 ('NPS_SMA_3_lag2', -0.015),
 ('NPS_SMA_3_lag3', -0.022),
 ('NPS_SMA_5', -0.014),
 ('NPS_SMA_5_lag1', -0.021),
 ('NPS_SMA_5_lag2', -0.016),
 ('NPS_SMA_5_lag3', -0.011),
 ('Stock Index', -0.036),
 ('Stock Index_lag1', -0.001),
 ('Stock Index_lag2', 0.002),
 ('Stock Index_lag3', 0.012),
 ('Stock Index_SMA_3', 0.002),
 ('Stock Index_SMA_3_lag1', 0.01),
 ('Stock Index_SMA_3_lag2', -0.011),
 ('Stock Index_SMA_3_lag3', -0.012),
 ('Stock Index_SMA_5', -0.012),
 ('Stock Index_SMA_5_lag1', -0.013),
 ('Stock Index_SMA_5_lag2', -0.01),
 ('Stock Index_SMA_5_lag3', -0.007),
 ('Max Temp', 0.018),
 ('Max Temp_lag1', -0.019),
 ('Max Temp_lag2', 0.003),
 ('Max Temp_lag3', -0.002),
 ('Min Temp', -0.001),
 ('Min Temp_lag1', -0.017),
 ('Min Temp_lag2', 0.0),
 ('Min Temp_lag3', 0.009),
 ('Mean Temp', 0.022),
 ('Mean Temp_lag1', -0.027),
 ('Mean Temp_lag2', -0.0),
 ('Mean Temp_lag3', 0.011),
 ('Heat Deg Days', -0.023),
 ('Heat Deg Days_lag1', 0.025),
 ('Heat Deg Days_lag2', 0.001),
 ('Heat Deg Days_lag3', -0.005),
 ('Cool Deg Days', 0.014),
 ('Cool Deg Days_lag1', -0.019),
 ('Cool Deg Days_lag2', -0.007),
 ('Cool Deg Days_lag3', 0.016),
 ('Total Rain (mm)', -0.003),
 ('Total Rain (mm)_lag1', -0.009),
 ('Total Rain (mm)_lag2', 0.012),
 ('Total Rain (mm)_lag3', -0.002),
 ('Total Snow (cm)', 0.016),
 ('Total Snow (cm)_lag1', -0.004),
 ('Total Snow (cm)_lag2', 0.003),
 ('Total Snow (cm)_lag3', 0.017),
 ('Total Precip (mm)', 0.001),
 ('Total Precip (mm)_lag1', -0.01),
 ('Total Precip (mm)_lag2', 0.012),
 ('Total Precip (mm)_lag3', 0.001),
 ('Snow on Grnd (cm)', 0.017),
 ('Snow on Grnd (cm)_lag1', -0.003),
 ('Snow on Grnd (cm)_lag2', 0.014),
 ('Snow on Grnd (cm)_lag3', 0.006),
 ('Sale', -0.002),
 ('Sale_lag1', -0.005),
 ('Sale_lag2', -0.043),
 ('Sale_lag3', -0.0)]
In [369]:
homeaudio_lr_coef_df = pd.DataFrame(homeaudio_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
homeaudio_lr_coef_df = homeaudio_lr_coef_df.rename(columns=col_rename)
homeaudio_lr_coef_df = homeaudio_lr_coef_df.iloc[1:,:]
homeaudio_lr_coef_df = homeaudio_lr_coef_df.loc[homeaudio_lr_coef_df['Coefficients']!=0.0]
homeaudio_lr_coef_df = homeaudio_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
homeaudio_lr_coef_df
Out[369]:
Features Coefficients
52 product_vertical_homeaudiospeaker 0.184
56 product_vertical_karaokeplayer 0.173
28 is_mass_market 0.162
24 is_cod 0.145
44 product_vertical_fmradio 0.094
4 Discount% 0.073
36 product_vertical_dock 0.067
48 product_vertical_hifisystem 0.066
121 Digital_lag1 0.049
221 SEM_lag1 0.037
43 product_vertical_dockingstation_lag3 0.035
161 Content Marketing_lag1 0.033
33 product_vertical_djcontroller_lag1 0.032
69 product_vertical_voicerecorder_lag1 0.032
68 product_vertical_voicerecorder 0.032
7 Discount%_lag3 0.027
32 product_vertical_djcontroller 0.027
317 Heat Deg Days_lag1 0.025
18 sla_lag2 0.025
23 product_procurement_sla_lag3 0.023
137 Digital_Ad_Stock_lag1 0.022
64 product_vertical_soundmixer 0.022
312 Mean Temp 0.022
21 product_procurement_sla_lag1 0.022
40 product_vertical_dockingstation 0.020
304 Max Temp 0.018
261 Other_lag1 0.018
237 SEM_Ad_Stock_lag1 0.018
177 Content_Marketing_Ad_Stock_lag1 0.017
50 product_vertical_hifisystem_lag2 0.017
336 Snow on Grnd (cm) 0.017
331 Total Snow (cm)_lag3 0.017
328 Total Snow (cm) 0.016
323 Cool Deg Days_lag3 0.016
133 Digital_EMA_8_lag1 0.016
173 Content Marketing_EMA_8_lag1 0.015
37 product_vertical_dock_lag1 0.015
233 SEM_EMA_8_lag1 0.015
243 Radio_lag3 0.015
338 Snow on Grnd (cm)_lag2 0.014
320 Cool Deg Days 0.014
251 Radio_SMA_5_lag3 0.014
263 Other_lag3 0.013
203 Affiliates_lag3 0.013
183 Online marketing_lag3 0.012
334 Total Precip (mm)_lag2 0.012
326 Total Rain (mm)_lag2 0.012
295 Stock Index_lag3 0.012
187 Online marketing_SMA_3_lag3 0.012
124 Digital_SMA_3 0.011
315 Mean Temp_lag3 0.011
61 product_vertical_slingbox_lag1 0.011
207 Affiliates_SMA_3_lag3 0.011
297 Stock Index_SMA_3_lag1 0.010
125 Digital_SMA_3_lag1 0.010
182 Online marketing_lag2 0.009
103 TV_lag3 0.009
224 SEM_SMA_3 0.009
225 SEM_SMA_3_lag1 0.009
198 Online_marketing_Ad_Stock_lag2 0.009
65 product_vertical_soundmixer_lag1 0.009
311 Min Temp_lag3 0.009
190 Online marketing_SMA_5_lag2 0.009
142 Sponsorship_lag2 0.009
280 NPS 0.009
232 SEM_EMA_8 0.009
241 Radio_lag1 0.008
189 Online marketing_SMA_5_lag1 0.008
132 Digital_EMA_8 0.008
154 Sponsorship_EMA_8_lag2 0.008
271 Other_SMA_5_lag3 0.008
199 Online_marketing_Ad_Stock_lag3 0.008
101 TV_lag1 0.008
164 Content Marketing_SMA_3 0.008
140 Sponsorship 0.007
165 Content Marketing_SMA_3_lag1 0.007
209 Affiliates_SMA_5_lag1 0.007
219 Affiliates_Ad_Stock_lag3 0.007
45 product_vertical_fmradio_lag1 0.007
194 Online marketing_EMA_8_lag2 0.007
259 Radio_Ad_Stock_lag3 0.007
218 Affiliates_Ad_Stock_lag2 0.007
169 Content Marketing_SMA_5_lag1 0.006
172 Content Marketing_EMA_8 0.006
185 Online marketing_SMA_3_lag1 0.006
215 Affiliates_EMA_8_lag3 0.006
195 Online marketing_EMA_8_lag3 0.006
155 Sponsorship_EMA_8_lag3 0.006
202 Affiliates_lag2 0.006
150 Sponsorship_SMA_5_lag2 0.006
210 Affiliates_SMA_5_lag2 0.006
214 Affiliates_EMA_8_lag2 0.006
75 payday_week_lag3 0.006
95 Total Investment_EMA_8_lag3 0.006
339 Snow on Grnd (cm)_lag3 0.006
235 SEM_EMA_8_lag3 0.006
136 Digital_Ad_Stock 0.005
158 Sponsorship_Ad_Stock_lag2 0.005
248 Radio_SMA_5 0.005
168 Content Marketing_SMA_5 0.005
246 Radio_SMA_3_lag2 0.005
245 Radio_SMA_3_lag1 0.005
186 Online marketing_SMA_3_lag2 0.005
191 Online marketing_SMA_5_lag3 0.005
236 SEM_Ad_Stock 0.005
205 Affiliates_SMA_3_lag1 0.005
93 Total Investment_EMA_8_lag1 0.005
94 Total Investment_EMA_8_lag2 0.005
193 Online marketing_EMA_8_lag1 0.004
27 is_cod_lag3 0.004
234 SEM_EMA_8_lag2 0.004
211 Affiliates_SMA_5_lag3 0.004
257 Radio_Ad_Stock_lag1 0.004
284 NPS_SMA_3 0.004
213 Affiliates_EMA_8_lag1 0.004
170 Content Marketing_SMA_5_lag2 0.004
81 Total Investment_lag1 0.004
92 Total Investment_EMA_8 0.004
77 holiday_week_lag1 0.004
277 Other_Ad_Stock_lag1 0.003
174 Content Marketing_EMA_8_lag2 0.003
175 Content Marketing_EMA_8_lag3 0.003
176 Content_Marketing_Ad_Stock 0.003
306 Max Temp_lag2 0.003
90 Total Investment_SMA_5_lag2 0.003
192 Online marketing_EMA_8 0.003
268 Other_SMA_5 0.003
152 Sponsorship_EMA_8 0.003
197 Online_marketing_Ad_Stock_lag1 0.003
229 SEM_SMA_5_lag1 0.003
228 SEM_SMA_5 0.003
206 Affiliates_SMA_3_lag2 0.003
97 Total_Investment_Ad_Stock_lag1 0.003
279 Other_Ad_Stock_lag3 0.003
285 NPS_SMA_3_lag1 0.003
128 Digital_SMA_5 0.003
330 Total Snow (cm)_lag2 0.003
135 Digital_EMA_8_lag3 0.003
265 Other_SMA_3_lag1 0.003
96 Total_Investment_Ad_Stock 0.002
217 Affiliates_Ad_Stock_lag1 0.002
212 Affiliates_EMA_8 0.002
294 Stock Index_lag2 0.002
296 Stock Index_SMA_3 0.002
12 deliverycdays 0.002
10 deliverybdays_lag2 0.002
84 Total Investment_SMA_3 0.002
14 deliverycdays_lag2 0.002
99 Total_Investment_Ad_Stock_lag3 0.002
85 Total Investment_SMA_3_lag1 0.002
47 product_vertical_fmradio_lag3 0.002
147 Sponsorship_SMA_3_lag3 0.002
253 Radio_EMA_8_lag1 0.002
8 deliverybdays 0.002
335 Total Precip (mm)_lag3 0.001
41 product_vertical_dockingstation_lag1 0.001
188 Online marketing_SMA_5 0.001
89 Total Investment_SMA_5_lag1 0.001
282 NPS_lag2 0.001
100 TV 0.001
119 TV_Ad_Stock_lag3 0.001
276 Other_Ad_Stock 0.001
156 Sponsorship_Ad_Stock 0.001
162 Content Marketing_lag2 0.001
144 Sponsorship_SMA_3 0.001
255 Radio_EMA_8_lag3 0.001
318 Heat Deg Days_lag2 0.001
98 Total_Investment_Ad_Stock_lag2 0.001
129 Digital_SMA_5_lag1 0.001
332 Total Precip (mm) 0.001
115 TV_EMA_8_lag3 0.001
55 product_vertical_homeaudiospeaker_lag3 -0.001
80 Total Investment -0.001
166 Content Marketing_SMA_3_lag2 -0.001
107 TV_SMA_3_lag3 -0.001
293 Stock Index_lag1 -0.001
273 Other_EMA_8_lag1 -0.001
149 Sponsorship_SMA_5_lag1 -0.001
250 Radio_SMA_5_lag2 -0.001
86 Total Investment_SMA_3_lag2 -0.001
178 Content_Marketing_Ad_Stock_lag2 -0.001
134 Digital_EMA_8_lag2 -0.001
204 Affiliates_SMA_3 -0.001
116 TV_Ad_Stock -0.001
308 Min Temp -0.001
340 Sale -0.002
3 gmv_lag3 -0.002
266 Other_SMA_3_lag2 -0.002
26 is_cod_lag2 -0.002
105 TV_SMA_3_lag1 -0.002
179 Content_Marketing_Ad_Stock_lag3 -0.002
148 Sponsorship_SMA_5 -0.002
247 Radio_SMA_3_lag3 -0.002
112 TV_EMA_8 -0.002
258 Radio_Ad_Stock_lag2 -0.002
260 Other -0.002
307 Max Temp_lag3 -0.002
272 Other_EMA_8 -0.002
327 Total Rain (mm)_lag3 -0.002
254 Radio_EMA_8_lag2 -0.003
239 SEM_Ad_Stock_lag3 -0.003
264 Other_SMA_3 -0.003
238 SEM_Ad_Stock_lag2 -0.003
222 SEM_lag2 -0.003
275 Other_EMA_8_lag3 -0.003
244 Radio_SMA_3 -0.003
17 sla_lag1 -0.003
74 payday_week_lag2 -0.003
22 product_procurement_sla_lag2 -0.003
240 Radio -0.003
108 TV_SMA_5 -0.003
113 TV_EMA_8_lag1 -0.003
324 Total Rain (mm) -0.003
62 product_vertical_slingbox_lag2 -0.003
117 TV_Ad_Stock_lag1 -0.003
337 Snow on Grnd (cm)_lag1 -0.003
91 Total Investment_SMA_5_lag3 -0.004
70 product_vertical_voicerecorder_lag2 -0.004
67 product_vertical_soundmixer_lag3 -0.004
329 Total Snow (cm)_lag1 -0.004
63 product_vertical_slingbox_lag3 -0.004
6 Discount%_lag2 -0.004
143 Sponsorship_lag3 -0.004
145 Sponsorship_SMA_3_lag1 -0.004
167 Content Marketing_SMA_3_lag3 -0.004
226 SEM_SMA_3_lag2 -0.004
13 deliverycdays_lag1 -0.004
9 deliverybdays_lag1 -0.005
200 Affiliates -0.005
171 Content Marketing_SMA_5_lag3 -0.005
111 TV_SMA_5_lag3 -0.005
31 is_mass_market_lag3 -0.005
130 Digital_SMA_5_lag2 -0.005
157 Sponsorship_Ad_Stock_lag1 -0.005
104 TV_SMA_3 -0.005
341 Sale_lag1 -0.005
319 Heat Deg Days_lag3 -0.005
109 TV_SMA_5_lag1 -0.005
269 Other_SMA_5_lag1 -0.005
180 Online marketing -0.005
201 Affiliates_lag1 -0.005
78 holiday_week_lag2 -0.006
181 Online marketing_lag1 -0.006
73 payday_week_lag1 -0.006
114 TV_EMA_8_lag2 -0.006
106 TV_SMA_3_lag2 -0.007
139 Digital_Ad_Stock_lag3 -0.007
151 Sponsorship_SMA_5_lag3 -0.007
46 product_vertical_fmradio_lag2 -0.007
5 Discount%_lag1 -0.007
227 SEM_SMA_3_lag3 -0.007
303 Stock Index_SMA_5_lag3 -0.007
322 Cool Deg Days_lag2 -0.007
39 product_vertical_dock_lag3 -0.008
126 Digital_SMA_3_lag2 -0.008
231 SEM_SMA_5_lag3 -0.008
267 Other_SMA_3_lag3 -0.008
270 Other_SMA_5_lag2 -0.009
325 Total Rain (mm)_lag1 -0.009
138 Digital_Ad_Stock_lag2 -0.009
122 Digital_lag2 -0.009
110 TV_SMA_5_lag2 -0.009
302 Stock Index_SMA_5_lag2 -0.010
15 deliverycdays_lag3 -0.010
333 Total Precip (mm)_lag1 -0.010
11 deliverybdays_lag3 -0.010
274 Other_EMA_8_lag2 -0.010
291 NPS_SMA_5_lag3 -0.011
118 TV_Ad_Stock_lag2 -0.011
298 Stock Index_SMA_3_lag2 -0.011
54 product_vertical_homeaudiospeaker_lag2 -0.011
71 product_vertical_voicerecorder_lag3 -0.011
278 Other_Ad_Stock_lag2 -0.011
30 is_mass_market_lag2 -0.011
300 Stock Index_SMA_5 -0.012
38 product_vertical_dock_lag2 -0.012
299 Stock Index_SMA_3_lag3 -0.012
223 SEM_lag3 -0.012
242 Radio_lag2 -0.012
220 SEM -0.012
301 Stock Index_SMA_5_lag1 -0.013
127 Digital_SMA_3_lag3 -0.013
281 NPS_lag1 -0.013
160 Content Marketing -0.013
288 NPS_SMA_5 -0.014
131 Digital_SMA_5_lag3 -0.014
163 Content Marketing_lag3 -0.014
60 product_vertical_slingbox -0.015
2 gmv_lag2 -0.015
286 NPS_SMA_3_lag2 -0.015
72 payday_week -0.015
120 Digital -0.015
290 NPS_SMA_5_lag2 -0.016
123 Digital_lag3 -0.016
141 Sponsorship_lag1 -0.016
25 is_cod_lag1 -0.017
66 product_vertical_soundmixer_lag2 -0.017
309 Min Temp_lag1 -0.017
16 sla -0.018
321 Cool Deg Days_lag1 -0.019
57 product_vertical_karaokeplayer_lag1 -0.019
305 Max Temp_lag1 -0.019
29 is_mass_market_lag1 -0.020
19 sla_lag3 -0.020
289 NPS_SMA_5_lag1 -0.021
287 NPS_SMA_3_lag3 -0.022
316 Heat Deg Days -0.023
262 Other_lag2 -0.024
76 holiday_week -0.024
102 TV_lag2 -0.024
58 product_vertical_karaokeplayer_lag2 -0.027
313 Mean Temp_lag1 -0.027
59 product_vertical_karaokeplayer_lag3 -0.030
1 gmv_lag1 -0.031
49 product_vertical_hifisystem_lag1 -0.032
53 product_vertical_homeaudiospeaker_lag1 -0.033
51 product_vertical_hifisystem_lag3 -0.033
42 product_vertical_dockingstation_lag2 -0.033
79 holiday_week_lag3 -0.035
35 product_vertical_djcontroller_lag3 -0.035
292 Stock Index -0.036
34 product_vertical_djcontroller_lag2 -0.036
20 product_procurement_sla -0.040
342 Sale_lag2 -0.043

Plotting the Features in descending order of Importance for homeaudio

In [370]:
# Use a tall figure so that each of the several hundred feature labels gets its own row.
plt.figure(figsize=(10, 35), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=homeaudio_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplot fits into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 most important features affecting GMV(Revenue) for homeaudio are:
Features Coefficients
product_vertical_homeaudiospeaker 0.184
product_vertical_karaokeplayer 0.173
is_mass_market 0.162
is_cod 0.145
product_vertical_fmradio 0.094
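
A minimal sketch of how such a top-5 list could be derived from a coefficient table. The frame below is a small hand-built stand-in (`coef_df` is a hypothetical name) that reuses a few coefficient values printed above; it is not the notebook's `homeaudio_lr_coef_df`.

```python
import pandas as pd

# Hypothetical stand-in for a fitted model's coefficient table,
# filled with a handful of values printed in the outputs above.
coef_df = pd.DataFrame({
    "Features": [
        "Sale_lag2",
        "is_cod",
        "product_vertical_karaokeplayer",
        "product_procurement_sla",
        "product_vertical_fmradio",
        "is_mass_market",
        "Stock Index",
        "product_vertical_homeaudiospeaker",
    ],
    "Coefficients": [-0.043, 0.145, 0.173, -0.040, 0.094, 0.162, -0.036, 0.184],
})

# Sort by signed coefficient, largest first, and keep the first five rows.
top5 = coef_df.sort_values("Coefficients", ascending=False).head(5)
print(top5.to_string(index=False))
```

Sorting on the signed value ranks only positive drivers of GMV; sorting on `coef_df["Coefficients"].abs()` instead would be one way to also surface strongly negative features.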

Distributive Lag Model(Multiplicative)

The Distributive Lag Model (Additive) helped us capture not only the current effect but also the carry-over effect of all the variables (dependent and independent).

Yt = α+ µ1Yt-1 + µ2Yt-2 + µ3Yt-3 + ....

    + β1X1t + β1X1t-1 + β1X1t-2 + ....

    + β2X2t + β2X2t-1 + β2X2t-2 + ....

    + β3X3t + β3X3t-1 + β3X3t-2 + ....

    + β4X4t + β4X4t-1 + β4X4t-2 + ....

    + β5X5t + β5X5t-1 + β5X5t-2 + ....

    + ϵ
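
The same model can be written compactly with a separate coefficient for every lag, which is how the lagged columns enter the regression in practice (each `_lag1`, `_lag2`, `_lag3` column receives its own coefficient in the outputs above). The index names j and k, the double subscript on β, and the totals J and K are notation assumed here for illustration; this notebook uses K = 3.

```latex
Y_t = \alpha
      + \sum_{k=1}^{K} \mu_k \, Y_{t-k}
      + \sum_{j=1}^{J} \sum_{k=0}^{K} \beta_{j,k} \, X_{j,\,t-k}
      + \epsilon_t
```

Here k = 0 denotes the current-week value of regressor j, and k ≥ 1 its carry-over terms.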

The Distributive Lag Model (Multiplicative) will now help us capture the interactions between the current and carry-over effects of the KPIs.

To fit a multiplicative model, take logarithms of the data (on both sides of the model) and then analyse the logged data as before. A product of terms becomes a sum of logarithms, so the same linear regression approach applies.

ln(Yt) = α+ µ1ln(Yt-1) + µ2ln(Yt-2) + µ3ln(Yt-3) + ....

    + β1ln(X1t) + β1ln(X1t-1) + β1ln(X1t-2) + ....

    + β2ln(X2t) + β2ln(X2t-1) + β2ln(X2t-2) + ....

    + β3ln(X3t) + β3ln(X3t-1) + β3ln(X3t-2) + ....

    + β4ln(X4t) + β4ln(X4t-1) + β4ln(X4t-2) + ....

    + β5ln(X5t) + β5ln(X5t-1) + β5ln(X5t-2) + ....

    + ϵ'
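
A minimal sketch of the log transformation step, assuming `np.log1p` (that is, ln(1 + x)) is used so that the many zero values in the data stay finite. The helper name `log_transform` and the toy frame are hypothetical, and the notebook's own transformation may differ.

```python
import numpy as np
import pandas as pd


def log_transform(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame with ln(1 + x) applied to every column.

    Assumption: log1p is chosen so that zeros map to 0 instead of -inf.
    It is only defined for x > -1, so such values are rejected.
    """
    if (df <= -1).any().any():
        raise ValueError("log1p needs every value to be greater than -1")
    return np.log1p(df)


# Toy frame using a few values from the homeaudio rows shown above.
toy = pd.DataFrame(
    {
        "gmv": [4573783.133, 5371525.0, 4679828.0],
        "TV": [0.054, 0.054, 0.054],
        "Sale": [0, 2, 0],
    }
)

logged = log_transform(toy)
print(logged.round(3))
```

Columns that can fall to -1 or below, such as winter temperatures, would need to be shifted or left untransformed before a step like this.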
In [371]:
homeaudio_org_df.head()
Out[371]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [372]:
# Making copy of dataframes from the original ones
cameraaccessory_dlmul_df = cameraaccessory_org_df.copy()
gamingaccessory_dlmul_df = gamingaccessory_org_df.copy()
homeaudio_dlmul_df = homeaudio_org_df.copy()
homeaudio_dlmul_df.head()
Out[372]:
Week gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 28 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 29 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 30 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 31 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 32 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0
In [373]:
# Check the total count and percentage of null values in every column of the dataframe.
# Note: this cell inspects homeaudio_dladd_df (the additive-model frame), not
# homeaudio_dlmul_df, so the output below lists that frame's lagged columns.

total = pd.DataFrame(homeaudio_dladd_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(
    round(100 * (homeaudio_dladd_df.isnull().sum() / homeaudio_dladd_df.shape[0]), 2).sort_values(ascending=False),
    columns=['Percentage'])

pd.concat([total, percentage], axis=1).head()
Out[373]:
Total Percentage
Sale_lag3 0 0.000
TV_Ad_Stock_lag1 0 0.000
TV_SMA_5_lag1 0 0.000
TV_SMA_5_lag2 0 0.000
TV_SMA_5_lag3 0 0.000

We will drop the Week column, as it is a row identifier and will not help in predicting revenue.

In [374]:
# removing columns
cameraaccessory_dlmul_df = cameraaccessory_dlmul_df.drop('Week', axis=1)
gamingaccessory_dlmul_df = gamingaccessory_dlmul_df.drop('Week', axis=1)
homeaudio_dlmul_df = homeaudio_dlmul_df.drop('Week', axis=1)
homeaudio_dlmul_df.head()
Out[374]:
gmv Discount% deliverybdays deliverycdays sla product_procurement_sla is_cod is_mass_market product_vertical_djcontroller product_vertical_dock product_vertical_dockingstation product_vertical_fmradio product_vertical_hifisystem product_vertical_homeaudiospeaker product_vertical_karaokeplayer product_vertical_slingbox product_vertical_soundmixer product_vertical_voicerecorder payday_week holiday_week Total Investment Total Investment_SMA_3 Total Investment_SMA_5 Total Investment_EMA_8 Total_Investment_Ad_Stock TV TV_SMA_3 TV_SMA_5 TV_EMA_8 TV_Ad_Stock Digital Digital_SMA_3 Digital_SMA_5 Digital_EMA_8 Digital_Ad_Stock Sponsorship Sponsorship_SMA_3 Sponsorship_SMA_5 Sponsorship_EMA_8 Sponsorship_Ad_Stock Content Marketing Content Marketing_SMA_3 Content Marketing_SMA_5 Content Marketing_EMA_8 Content_Marketing_Ad_Stock Online marketing Online marketing_SMA_3 Online marketing_SMA_5 Online marketing_EMA_8 Online_marketing_Ad_Stock Affiliates Affiliates_SMA_3 Affiliates_SMA_5 Affiliates_EMA_8 Affiliates_Ad_Stock SEM SEM_SMA_3 SEM_SMA_5 SEM_EMA_8 SEM_Ad_Stock Radio Radio_SMA_3 Radio_SMA_5 Radio_EMA_8 Radio_Ad_Stock Other Other_SMA_3 Other_SMA_5 Other_EMA_8 Other_Ad_Stock NPS NPS_SMA_3 NPS_SMA_5 Stock Index Stock Index_SMA_3 Stock Index_SMA_5 Max Temp Min Temp Mean Temp Heat Deg Days Cool Deg Days Total Rain (mm) Total Snow (cm) Total Precip (mm) Snow on Grnd (cm) Sale
25 4573783.133 31.451 0.000 0.000 7.369 2.863 1583 1366 8 33 1 516.000 23 1374.000 0 0 0 63 0 0 4.265 0.000 0.000 4.265 4.265 0.054 0.000 0.000 0.054 0.054 0.633 0.000 0.000 0.633 0.633 1.854 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.332 0.137 0.000 0.000 0.137 0.137 1.256 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 28.000 12.500 20.100 0.283 2.383 4.417 0.000 4.417 0.000 0
26 5371525.000 32.967 0.000 0.000 6.985 2.746 1868 1610 7 50 1 574.000 42 1623.000 0 0 0 69 1 0 4.265 0.000 0.000 4.265 6.824 0.054 0.000 0.000 0.054 0.086 0.633 0.000 0.000 0.633 1.013 1.854 0.000 0.000 1.854 2.966 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.332 0.531 0.137 0.000 0.000 0.137 0.219 1.256 0.000 0.000 1.256 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 1177.000 0.000 0.000 33.000 11.000 23.183 0.000 5.183 1.400 0.000 1.400 0.000 2
27 4679828.000 32.357 0.000 0.000 7.072 2.861 1758 1569 4 56 0 577.000 36 1430.000 0 0 0 46 0 0 4.265 4.265 0.000 4.265 8.359 0.054 0.054 0.000 0.054 0.106 0.633 0.633 0.000 0.633 1.241 1.854 1.854 0.000 1.854 3.634 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.651 0.137 0.137 0.000 0.137 0.269 1.256 1.256 0.000 1.256 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 31.500 14.500 23.060 0.000 5.060 1.080 0.000 1.080 0.000 0
28 3451151.000 32.208 0.000 0.000 7.201 2.735 1244 1072 2 43 0 420.000 20 1025.000 0 0 0 44 1 0 4.265 4.265 0.000 4.265 9.281 0.054 0.054 0.000 0.054 0.118 0.633 0.633 0.000 0.633 1.377 1.854 1.854 0.000 1.854 4.034 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.332 0.722 0.137 0.137 0.000 0.137 0.298 1.256 1.256 0.000 1.256 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 1177.000 1177.000 0.000 33.500 16.000 24.567 0.000 6.567 4.633 0.000 4.633 0.000 0
29 2599.000 16.130 0.000 0.000 9.000 2.000 0 0 0 0 0 0.000 0 1.000 0 0 0 0 0 0 1.013 3.181 3.615 3.542 6.581 0.001 0.036 0.043 0.042 0.072 0.256 0.507 0.558 0.549 1.082 0.213 1.307 1.526 1.489 2.634 0.000 0.000 0.000 0.000 0.000 0.026 0.230 0.271 0.264 0.459 0.015 0.096 0.113 0.110 0.194 0.503 1.005 1.105 1.089 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 56.395 55.677 1206.000 1186.667 1182.800 28.500 15.000 21.650 0.000 3.650 0.350 0.000 0.350 0.000 0

Creating 3-week lag variables for the dependent variable (GMV) as well as the independent variables

In [375]:
cameraaccessory_dlmul_df_columns = cameraaccessory_dlmul_df.columns
gamingaccessory_dlmul_df_columns = gamingaccessory_dlmul_df.columns
homeaudio_dlmul_df_columns = homeaudio_dlmul_df.columns
In [376]:
cameraaccessory_dlmul_df = lag_variables(cameraaccessory_dlmul_df,cameraaccessory_dlmul_df_columns,3)
gamingaccessory_dlmul_df = lag_variables(gamingaccessory_dlmul_df,gamingaccessory_dlmul_df_columns,3)
homeaudio_dlmul_df = lag_variables(homeaudio_dlmul_df,homeaudio_dlmul_df_columns,3)
homeaudio_dlmul_df.head()
Out[376]:
gmv gmv_lag3 Discount% Discount%_lag3 deliverybdays deliverybdays_lag3 deliverycdays deliverycdays_lag3 sla sla_lag3 product_procurement_sla product_procurement_sla_lag3 is_cod is_cod_lag3 is_mass_market is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag3 payday_week payday_week_lag3 holiday_week holiday_week_lag3 Total Investment Total Investment_lag3 Total Investment_SMA_3 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag3 TV TV_lag3 TV_SMA_3 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag3 Digital Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag3 Online marketing Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag3 Online marketing_SMA_5 
Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag3 SEM SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag3 Radio Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag3 Other Other_lag3 Other_SMA_3 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag3 NPS NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag3 Stock Index Stock Index_lag3 Stock Index_SMA_3 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag3 Min Temp Min Temp_lag3 Mean Temp Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag3 Sale Sale_lag3
25 4573783.133 nan 31.451 nan 0.000 nan 0.000 nan 7.369 nan 2.863 nan 1583 nan 1366 nan 8 nan 33 nan 1 nan 516.000 nan 23 nan 1374.000 nan 0 nan 0 nan 0 nan 63 nan 0 nan 0 nan 4.265 nan 0.000 nan 0.000 nan 4.265 nan 4.265 nan 0.054 nan 0.000 nan 0.000 nan 0.054 nan 0.054 nan 0.633 nan 0.000 nan 0.000 nan 0.633 nan 0.633 nan 1.854 nan 0.000 nan 0.000 nan 1.854 nan 1.854 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.332 nan 0.000 nan 0.000 nan 0.332 nan 0.332 nan 0.137 nan 0.000 nan 0.000 nan 0.137 nan 0.137 nan 1.256 nan 0.000 nan 0.000 nan 1.256 nan 1.256 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 54.600 nan 0.000 nan 0.000 nan 1177.000 nan 0.000 nan 0.000 nan 28.000 nan 12.500 nan 20.100 nan 0.283 nan 2.383 nan 4.417 nan 0.000 nan 4.417 nan 0.000 nan 0 nan
26 5371525.000 nan 32.967 nan 0.000 nan 0.000 nan 6.985 nan 2.746 nan 1868 nan 1610 nan 7 nan 50 nan 1 nan 574.000 nan 42 nan 1623.000 nan 0 nan 0 nan 0 nan 69 nan 1 nan 0 nan 4.265 nan 0.000 nan 0.000 nan 4.265 nan 6.824 nan 0.054 nan 0.000 nan 0.000 nan 0.054 nan 0.086 nan 0.633 nan 0.000 nan 0.000 nan 0.633 nan 1.013 nan 1.854 nan 0.000 nan 0.000 nan 1.854 nan 2.966 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.332 nan 0.000 nan 0.000 nan 0.332 nan 0.531 nan 0.137 nan 0.000 nan 0.000 nan 0.137 nan 0.219 nan 1.256 nan 0.000 nan 0.000 nan 1.256 nan 2.010 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 54.600 nan 0.000 nan 0.000 nan 1177.000 nan 0.000 nan 0.000 nan 33.000 nan 11.000 nan 23.183 nan 0.000 nan 5.183 nan 1.400 nan 0.000 nan 1.400 nan 0.000 nan 2 nan
27 4679828.000 nan 32.357 nan 0.000 nan 0.000 nan 7.072 nan 2.861 nan 1758 nan 1569 nan 4 nan 56 nan 0 nan 577.000 nan 36 nan 1430.000 nan 0 nan 0 nan 0 nan 46 nan 0 nan 0 nan 4.265 nan 4.265 nan 0.000 nan 4.265 nan 8.359 nan 0.054 nan 0.054 nan 0.000 nan 0.054 nan 0.106 nan 0.633 nan 0.633 nan 0.000 nan 0.633 nan 1.241 nan 1.854 nan 1.854 nan 0.000 nan 1.854 nan 3.634 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.332 nan 0.332 nan 0.000 nan 0.332 nan 0.651 nan 0.137 nan 0.137 nan 0.000 nan 0.137 nan 0.269 nan 1.256 nan 1.256 nan 0.000 nan 1.256 nan 2.462 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 0.000 nan 54.600 nan 54.600 nan 0.000 nan 1177.000 nan 1177.000 nan 0.000 nan 31.500 nan 14.500 nan 23.060 nan 0.000 nan 5.060 nan 1.080 nan 0.000 nan 1.080 nan 0.000 nan 0 nan
28 3451151.000 4573783.133 32.208 31.451 0.000 0.000 0.000 0.000 7.201 7.369 2.735 2.863 1244 1583.000 1072 1366.000 2 8.000 43 33.000 0 1.000 420.000 516.000 20 23.000 1025.000 1374.000 0 0.000 0 0.000 0 0.000 44 63.000 1 0.000 0 0.000 4.265 4.265 4.265 0.000 0.000 0.000 4.265 4.265 9.281 4.265 0.054 0.054 0.054 0.000 0.000 0.000 0.054 0.054 0.118 0.054 0.633 0.633 0.633 0.000 0.000 0.000 0.633 0.633 1.377 0.633 1.854 1.854 1.854 0.000 0.000 0.000 1.854 1.854 4.034 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.000 0.000 0.000 0.332 0.332 0.722 0.332 0.137 0.137 0.137 0.000 0.000 0.000 0.137 0.137 0.298 0.137 1.256 1.256 1.256 0.000 0.000 0.000 1.256 1.256 2.733 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 0.000 0.000 0.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 33.500 28.000 16.000 12.500 24.567 20.100 0.000 0.283 6.567 2.383 4.633 4.417 0.000 0.000 4.633 4.417 0.000 0.000 0 0.000
29 2599.000 5371525.000 16.130 32.967 0.000 0.000 0.000 0.000 9.000 6.985 2.000 2.746 0 1868.000 0 1610.000 0 7.000 0 50.000 0 1.000 0.000 574.000 0 42.000 1.000 1623.000 0 0.000 0 0.000 0 0.000 0 69.000 0 1.000 0 0.000 1.013 4.265 3.181 0.000 3.615 0.000 3.542 4.265 6.581 6.824 0.001 0.054 0.036 0.000 0.043 0.000 0.042 0.054 0.072 0.086 0.256 0.633 0.507 0.000 0.558 0.000 0.549 0.633 1.082 1.013 0.213 1.854 1.307 0.000 1.526 0.000 1.489 1.854 2.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.230 0.000 0.271 0.000 0.264 0.332 0.459 0.531 0.015 0.137 0.096 0.000 0.113 0.000 0.110 0.137 0.194 0.219 0.503 1.256 1.005 0.000 1.105 0.000 1.089 1.256 2.143 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 56.395 0.000 55.677 0.000 1206.000 1177.000 1186.667 0.000 1182.800 0.000 28.500 33.000 15.000 11.000 21.650 23.183 0.000 0.000 3.650 5.183 0.350 1.400 0.000 0.000 0.350 1.400 0.000 0.000 0 2.000
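
A minimal sketch of a helper that would produce lagged columns laid out like the output above, with each `_lagN` column placed directly after its source column. `add_lag_columns` is a hypothetical stand-in written for illustration; the notebook's own `lag_variables` is defined elsewhere and may be implemented differently.

```python
import pandas as pd


def add_lag_columns(df: pd.DataFrame, columns, lag: int) -> pd.DataFrame:
    """Insert `<col>_lag<lag>` right after each column named in `columns`.

    Hypothetical stand-in for lag_variables; the first `lag` rows of each
    new column are NaN because no earlier week exists for them.
    """
    out = df.copy()
    for col in columns:
        position = out.columns.get_loc(col) + 1
        out.insert(position, f"{col}_lag{lag}", out[col].shift(lag))
    return out


# Two columns from the first five homeaudio rows shown above.
toy = pd.DataFrame(
    {
        "gmv": [4573783.133, 5371525.0, 4679828.0, 3451151.0, 2599.0],
        "Discount%": [31.451, 32.967, 32.357, 32.208, 16.130],
    },
    index=[25, 26, 27, 28, 29],
)

base_columns = list(toy.columns)  # captured once, before any lag is added
for lag in (3, 2, 1):             # same order as the notebook: 3, then 2, then 1
    toy = add_lag_columns(toy, base_columns, lag)

print(toy)
```

Capturing the column list before the first call keeps later calls from lagging the lag columns themselves. The leading NaN rows would still need to be dropped or filled before fitting a model.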

Creating 2-week lag variables for the dependent variable (GMV) as well as the independent variables

In [377]:
cameraaccessory_dlmul_df = lag_variables(cameraaccessory_dlmul_df,cameraaccessory_dlmul_df_columns,2)
gamingaccessory_dlmul_df = lag_variables(gamingaccessory_dlmul_df,gamingaccessory_dlmul_df_columns,2)
homeaudio_dlmul_df = lag_variables(homeaudio_dlmul_df,homeaudio_dlmul_df_columns,2)
homeaudio_dlmul_df.head()
Out[377]:
gmv gmv_lag2 gmv_lag3 Discount% Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag2 deliverycdays_lag3 sla sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag2 Total Investment_lag3 Total Investment_SMA_3 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag2 
Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online marketing Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock 
Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag2 Stock Index_lag3 Stock Index_SMA_3 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag2 Sale_lag3
25 4573783.133 nan nan 31.451 nan nan 0.000 nan nan 0.000 nan nan 7.369 nan nan 2.863 nan nan 1583 nan nan 1366 nan nan 8 nan nan 33 nan nan 1 nan nan 516.000 nan nan 23 nan nan 1374.000 nan nan 0 nan nan 0 nan nan 0 nan nan 63 nan nan 0 nan nan 0 nan nan 4.265 nan nan 0.000 nan nan 0.000 nan nan 4.265 nan nan 4.265 nan nan 0.054 nan nan 0.000 nan nan 0.000 nan nan 0.054 nan nan 0.054 nan nan 0.633 nan nan 0.000 nan nan 0.000 nan nan 0.633 nan nan 0.633 nan nan 1.854 nan nan 0.000 nan nan 0.000 nan nan 1.854 nan nan 1.854 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.332 nan nan 0.137 nan nan 0.000 nan nan 0.000 nan nan 0.137 nan nan 0.137 nan nan 1.256 nan nan 0.000 nan nan 0.000 nan nan 1.256 nan nan 1.256 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 54.600 nan nan 0.000 nan nan 0.000 nan nan 1177.000 nan nan 0.000 nan nan 0.000 nan nan 28.000 nan nan 12.500 nan nan 20.100 nan nan 0.283 nan nan 2.383 nan nan 4.417 nan nan 0.000 nan nan 4.417 nan nan 0.000 nan nan 0 nan nan
26 5371525.000 nan nan 32.967 nan nan 0.000 nan nan 0.000 nan nan 6.985 nan nan 2.746 nan nan 1868 nan nan 1610 nan nan 7 nan nan 50 nan nan 1 nan nan 574.000 nan nan 42 nan nan 1623.000 nan nan 0 nan nan 0 nan nan 0 nan nan 69 nan nan 1 nan nan 0 nan nan 4.265 nan nan 0.000 nan nan 0.000 nan nan 4.265 nan nan 6.824 nan nan 0.054 nan nan 0.000 nan nan 0.000 nan nan 0.054 nan nan 0.086 nan nan 0.633 nan nan 0.000 nan nan 0.000 nan nan 0.633 nan nan 1.013 nan nan 1.854 nan nan 0.000 nan nan 0.000 nan nan 1.854 nan nan 2.966 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.000 nan nan 0.000 nan nan 0.332 nan nan 0.531 nan nan 0.137 nan nan 0.000 nan nan 0.000 nan nan 0.137 nan nan 0.219 nan nan 1.256 nan nan 0.000 nan nan 0.000 nan nan 1.256 nan nan 2.010 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 0.000 nan nan 54.600 nan nan 0.000 nan nan 0.000 nan nan 1177.000 nan nan 0.000 nan nan 0.000 nan nan 33.000 nan nan 11.000 nan nan 23.183 nan nan 0.000 nan nan 5.183 nan nan 1.400 nan nan 0.000 nan nan 1.400 nan nan 0.000 nan nan 2 nan nan
27 4679828.000 4573783.133 nan 32.357 31.451 nan 0.000 0.000 nan 0.000 0.000 nan 7.072 7.369 nan 2.861 2.863 nan 1758 1583.000 nan 1569 1366.000 nan 4 8.000 nan 56 33.000 nan 0 1.000 nan 577.000 516.000 nan 36 23.000 nan 1430.000 1374.000 nan 0 0.000 nan 0 0.000 nan 0 0.000 nan 46 63.000 nan 0 0.000 nan 0 0.000 nan 4.265 4.265 nan 4.265 0.000 nan 0.000 0.000 nan 4.265 4.265 nan 8.359 4.265 nan 0.054 0.054 nan 0.054 0.000 nan 0.000 0.000 nan 0.054 0.054 nan 0.106 0.054 nan 0.633 0.633 nan 0.633 0.000 nan 0.000 0.000 nan 0.633 0.633 nan 1.241 0.633 nan 1.854 1.854 nan 1.854 0.000 nan 0.000 0.000 nan 1.854 1.854 nan 3.634 1.854 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.332 0.332 nan 0.332 0.000 nan 0.000 0.000 nan 0.332 0.332 nan 0.651 0.332 nan 0.137 0.137 nan 0.137 0.000 nan 0.000 0.000 nan 0.137 0.137 nan 0.269 0.137 nan 1.256 1.256 nan 1.256 0.000 nan 0.000 0.000 nan 1.256 1.256 nan 2.462 1.256 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 0.000 0.000 nan 54.600 54.600 nan 54.600 0.000 nan 0.000 0.000 nan 1177.000 1177.000 nan 1177.000 0.000 nan 0.000 0.000 nan 31.500 28.000 nan 14.500 12.500 nan 23.060 20.100 nan 0.000 0.283 nan 5.060 2.383 nan 1.080 4.417 nan 0.000 0.000 nan 1.080 4.417 nan 0.000 0.000 nan 0 0.000 nan
28 3451151.000 5371525.000 4573783.133 32.208 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 7.201 6.985 7.369 2.735 2.746 2.863 1244 1868.000 1583.000 1072 1610.000 1366.000 2 7.000 8.000 43 50.000 33.000 0 1.000 1.000 420.000 574.000 516.000 20 42.000 23.000 1025.000 1623.000 1374.000 0 0.000 0.000 0 0.000 0.000 0 0.000 0.000 44 69.000 63.000 1 1.000 0.000 0 0.000 0.000 4.265 4.265 4.265 4.265 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 9.281 6.824 4.265 0.054 0.054 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.118 0.086 0.054 0.633 0.633 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 1.377 1.013 0.633 1.854 1.854 1.854 1.854 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 4.034 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.722 0.531 0.332 0.137 0.137 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.298 0.219 0.137 1.256 1.256 1.256 1.256 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 2.733 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 54.600 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 33.500 33.000 28.000 16.000 11.000 12.500 24.567 23.183 20.100 0.000 0.000 0.283 6.567 5.183 2.383 4.633 1.400 4.417 0.000 0.000 0.000 4.633 1.400 4.417 0.000 0.000 0.000 0 2.000 0.000
29 2599.000 4679828.000 5371525.000 16.130 32.357 32.967 0.000 0.000 0.000 0.000 0.000 0.000 9.000 7.072 6.985 2.000 2.861 2.746 0 1758.000 1868.000 0 1569.000 1610.000 0 4.000 7.000 0 56.000 50.000 0 0.000 1.000 0.000 577.000 574.000 0 36.000 42.000 1.000 1430.000 1623.000 0 0.000 0.000 0 0.000 0.000 0 0.000 0.000 0 46.000 69.000 0 0.000 1.000 0 0.000 0.000 1.013 4.265 4.265 3.181 4.265 0.000 3.615 0.000 0.000 3.542 4.265 4.265 6.581 8.359 6.824 0.001 0.054 0.054 0.036 0.054 0.000 0.043 0.000 0.000 0.042 0.054 0.054 0.072 0.106 0.086 0.256 0.633 0.633 0.507 0.633 0.000 0.558 0.000 0.000 0.549 0.633 0.633 1.082 1.241 1.013 0.213 1.854 1.854 1.307 1.854 0.000 1.526 0.000 0.000 1.489 1.854 1.854 2.634 3.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.332 0.230 0.332 0.000 0.271 0.000 0.000 0.264 0.332 0.332 0.459 0.651 0.531 0.015 0.137 0.137 0.096 0.137 0.000 0.113 0.000 0.000 0.110 0.137 0.137 0.194 0.269 0.219 0.503 1.256 1.256 1.005 1.256 0.000 1.105 0.000 0.000 1.089 1.256 1.256 2.143 2.462 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 54.600 56.395 54.600 0.000 55.677 0.000 0.000 1206.000 1177.000 1177.000 1186.667 1177.000 0.000 1182.800 0.000 0.000 28.500 31.500 33.000 15.000 14.500 11.000 21.650 23.060 23.183 0.000 0.000 0.000 3.650 5.060 5.183 0.350 1.080 1.400 0.000 0.000 0.000 0.350 1.080 1.400 0.000 0.000 0.000 0 0.000 2.000

Creating 1-week lag variables for the dependent variable (GMV) as well as the independent variables

In [378]:
# Add 1-week lag columns (suffix _lag1) for every listed variable in each sub-category dataframe
cameraaccessory_dlmul_df = lag_variables(cameraaccessory_dlmul_df, cameraaccessory_dlmul_df_columns, 1)
gamingaccessory_dlmul_df = lag_variables(gamingaccessory_dlmul_df, gamingaccessory_dlmul_df_columns, 1)
homeaudio_dlmul_df = lag_variables(homeaudio_dlmul_df, homeaudio_dlmul_df_columns, 1)
homeaudio_dlmul_df.head()
Out[378]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 4573783.133 nan nan nan 31.451 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 7.369 nan nan nan 2.863 nan nan nan 1583 nan nan nan 1366 nan nan nan 8 nan nan nan 33 nan nan nan 1 nan nan nan 516.000 nan nan nan 23 nan nan nan 1374.000 nan nan nan 0 nan nan nan 0 nan nan nan 0 nan nan nan 63 nan nan nan 0 nan nan nan 0 nan nan nan 4.265 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 4.265 nan nan nan 4.265 nan nan nan 0.054 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.054 nan nan nan 0.054 nan nan nan 0.633 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.633 nan nan nan 0.633 nan nan nan 1.854 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 1.854 nan nan nan 1.854 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.332 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.332 nan nan nan 0.332 nan nan nan 0.137 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.137 nan nan nan 0.137 nan nan nan 1.256 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 1.256 nan nan nan 1.256 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 54.600 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 1177.000 nan nan nan 0.000 nan nan nan 0.000 nan nan nan 28.000 nan nan nan 12.500 nan nan nan 20.100 nan nan nan 0.283 nan nan nan 2.383 nan nan nan 4.417 nan nan nan 0.000 nan nan nan 4.417 nan nan nan 0.000 nan nan nan 0 nan nan nan
26 5371525.000 4573783.133 nan nan 32.967 31.451 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 6.985 7.369 nan nan 2.746 2.863 nan nan 1868 1583.000 nan nan 1610 1366.000 nan nan 7 8.000 nan nan 50 33.000 nan nan 1 1.000 nan nan 574.000 516.000 nan nan 42 23.000 nan nan 1623.000 1374.000 nan nan 0 0.000 nan nan 0 0.000 nan nan 0 0.000 nan nan 69 63.000 nan nan 1 0.000 nan nan 0 0.000 nan nan 4.265 4.265 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 4.265 4.265 nan nan 6.824 4.265 nan nan 0.054 0.054 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.054 0.054 nan nan 0.086 0.054 nan nan 0.633 0.633 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.633 0.633 nan nan 1.013 0.633 nan nan 1.854 1.854 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 1.854 1.854 nan nan 2.966 1.854 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.332 0.332 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.332 0.332 nan nan 0.531 0.332 nan nan 0.137 0.137 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.137 0.137 nan nan 0.219 0.137 nan nan 1.256 1.256 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 1.256 1.256 nan nan 2.010 1.256 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 54.600 54.600 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 1177.000 1177.000 nan nan 0.000 0.000 nan nan 0.000 0.000 nan nan 33.000 28.000 nan nan 11.000 12.500 nan nan 23.183 20.100 nan nan 0.000 0.283 nan nan 5.183 2.383 nan nan 1.400 4.417 nan nan 0.000 0.000 nan nan 1.400 4.417 nan nan 0.000 0.000 nan nan 2 0.000 nan nan
27 4679828.000 5371525.000 4573783.133 nan 32.357 32.967 31.451 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 7.072 6.985 7.369 nan 2.861 2.746 2.863 nan 1758 1868.000 1583.000 nan 1569 1610.000 1366.000 nan 4 7.000 8.000 nan 56 50.000 33.000 nan 0 1.000 1.000 nan 577.000 574.000 516.000 nan 36 42.000 23.000 nan 1430.000 1623.000 1374.000 nan 0 0.000 0.000 nan 0 0.000 0.000 nan 0 0.000 0.000 nan 46 69.000 63.000 nan 0 1.000 0.000 nan 0 0.000 0.000 nan 4.265 4.265 4.265 nan 4.265 0.000 0.000 nan 0.000 0.000 0.000 nan 4.265 4.265 4.265 nan 8.359 6.824 4.265 nan 0.054 0.054 0.054 nan 0.054 0.000 0.000 nan 0.000 0.000 0.000 nan 0.054 0.054 0.054 nan 0.106 0.086 0.054 nan 0.633 0.633 0.633 nan 0.633 0.000 0.000 nan 0.000 0.000 0.000 nan 0.633 0.633 0.633 nan 1.241 1.013 0.633 nan 1.854 1.854 1.854 nan 1.854 0.000 0.000 nan 0.000 0.000 0.000 nan 1.854 1.854 1.854 nan 3.634 2.966 1.854 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.332 0.332 0.332 nan 0.332 0.000 0.000 nan 0.000 0.000 0.000 nan 0.332 0.332 0.332 nan 0.651 0.531 0.332 nan 0.137 0.137 0.137 nan 0.137 0.000 0.000 nan 0.000 0.000 0.000 nan 0.137 0.137 0.137 nan 0.269 0.219 0.137 nan 1.256 1.256 1.256 nan 1.256 0.000 0.000 nan 0.000 0.000 0.000 nan 1.256 1.256 1.256 nan 2.462 2.010 1.256 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 0.000 0.000 0.000 nan 54.600 54.600 54.600 nan 54.600 0.000 0.000 nan 0.000 0.000 0.000 nan 1177.000 1177.000 1177.000 nan 1177.000 0.000 0.000 nan 0.000 0.000 0.000 nan 31.500 33.000 28.000 nan 14.500 11.000 12.500 nan 23.060 23.183 20.100 nan 0.000 0.000 0.283 nan 5.060 5.183 2.383 nan 1.080 1.400 4.417 nan 0.000 0.000 0.000 nan 1.080 1.400 4.417 nan 0.000 0.000 0.000 nan 0 2.000 0.000 nan
28 3451151.000 4679828.000 5371525.000 4573783.133 32.208 32.357 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.201 7.072 6.985 7.369 2.735 2.861 2.746 2.863 1244 1758.000 1868.000 1583.000 1072 1569.000 1610.000 1366.000 2 4.000 7.000 8.000 43 56.000 50.000 33.000 0 0.000 1.000 1.000 420.000 577.000 574.000 516.000 20 36.000 42.000 23.000 1025.000 1430.000 1623.000 1374.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 44 46.000 69.000 63.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 4.265 4.265 4.265 4.265 4.265 4.265 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 4.265 9.281 8.359 6.824 4.265 0.054 0.054 0.054 0.054 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.054 0.118 0.106 0.086 0.054 0.633 0.633 0.633 0.633 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 0.633 1.377 1.241 1.013 0.633 1.854 1.854 1.854 1.854 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 1.854 4.034 3.634 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.722 0.651 0.531 0.332 0.137 0.137 0.137 0.137 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.137 0.298 0.269 0.219 0.137 1.256 1.256 1.256 1.256 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 1.256 2.733 2.462 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 54.600 54.600 54.600 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 33.500 31.500 33.000 28.000 16.000 14.500 11.000 12.500 24.567 23.060 23.183 20.100 0.000 0.000 0.000 0.283 
6.567 5.060 5.183 2.383 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 0 0.000 2.000 0.000
29 2599.000 3451151.000 4679828.000 5371525.000 16.130 32.208 32.357 32.967 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 9.000 7.201 7.072 6.985 2.000 2.735 2.861 2.746 0 1244.000 1758.000 1868.000 0 1072.000 1569.000 1610.000 0 2.000 4.000 7.000 0 43.000 56.000 50.000 0 0.000 0.000 1.000 0.000 420.000 577.000 574.000 0 20.000 36.000 42.000 1.000 1025.000 1430.000 1623.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 44.000 46.000 69.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 1.013 4.265 4.265 4.265 3.181 4.265 4.265 0.000 3.615 0.000 0.000 0.000 3.542 4.265 4.265 4.265 6.581 9.281 8.359 6.824 0.001 0.054 0.054 0.054 0.036 0.054 0.054 0.000 0.043 0.000 0.000 0.000 0.042 0.054 0.054 0.054 0.072 0.118 0.106 0.086 0.256 0.633 0.633 0.633 0.507 0.633 0.633 0.000 0.558 0.000 0.000 0.000 0.549 0.633 0.633 0.633 1.082 1.377 1.241 1.013 0.213 1.854 1.854 1.854 1.307 1.854 1.854 0.000 1.526 0.000 0.000 0.000 1.489 1.854 1.854 1.854 2.634 4.034 3.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.332 0.332 0.230 0.332 0.332 0.000 0.271 0.000 0.000 0.000 0.264 0.332 0.332 0.332 0.459 0.722 0.651 0.531 0.015 0.137 0.137 0.137 0.096 0.137 0.137 0.000 0.113 0.000 0.000 0.000 0.110 0.137 0.137 0.137 0.194 0.298 0.269 0.219 0.503 1.256 1.256 1.256 1.005 1.256 1.256 0.000 1.105 0.000 0.000 0.000 1.089 1.256 1.256 1.256 2.143 2.733 2.462 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 54.600 54.600 56.395 54.600 54.600 0.000 55.677 0.000 0.000 0.000 1206.000 1177.000 1177.000 1177.000 1186.667 1177.000 1177.000 0.000 1182.800 0.000 0.000 0.000 28.500 33.500 31.500 33.000 15.000 16.000 14.500 11.000 21.650 24.567 23.060 23.183 0.000 0.000 0.000 0.000 3.650 6.567 
5.060 5.183 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0 0.000 0.000 2.000
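
The lag step above can be illustrated on toy data. This is only a minimal sketch: `add_lag_columns` is a hypothetical stand-in for the notebook's `lag_variables` helper and assumes the lag is built with pandas `shift`. The toy values and column order are illustrative, not taken from the real data.

```python
import pandas as pd

def add_lag_columns(df, columns, lag):
    """Hypothetical stand-in for lag_variables: append <col>_lag<lag> for each column."""
    out = df.copy()
    for col in columns:
        # shift(lag) moves values down by `lag` rows, so the row for week t
        # holds the value observed in week t - lag; the first `lag` rows become NaN
        out[f"{col}_lag{lag}"] = out[col].shift(lag)
    return out

# Toy weekly data indexed by week number (made-up values)
toy = pd.DataFrame(
    {"gmv": [100.0, 120.0, 90.0, 110.0], "TV": [4.0, 4.0, 1.0, 2.0]},
    index=[25, 26, 27, 28],
)

lagged = add_lag_columns(toy, ["gmv", "TV"], 1)
print(lagged)
```

As in the output above, the first week has no earlier observation, so its lag column is null until it is imputed.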

Imputing all null values with 0 (the nulls come from the lag columns, whose first rows have no earlier week to draw from)
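
A small sketch of this imputation on toy data, assuming the nulls sit only in the leading rows of the lag columns. The frame and its values are made up for illustration. Note that filling with 0 treats the missing history as a zero value rather than as unknown.

```python
import numpy as np
import pandas as pd

# Toy frame shaped like the lagged data: lag k is null in the first k rows
toy = pd.DataFrame({
    "gmv":      [100.0, 120.0, 90.0],
    "gmv_lag1": [np.nan, 100.0, 120.0],
    "gmv_lag2": [np.nan, np.nan, 100.0],
})

# Count nulls per column before imputing
nulls_before = toy.isna().sum()
print(nulls_before)

# Replace every null with 0, returning a new frame
filled = toy.fillna(value=0)
nulls_after = int(filled.isna().sum().sum())
print(filled)
```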

In [379]:
# Imputing all null values with 0
cameraaccessory_dlmul_df.fillna(value=0, inplace=True)
gamingaccessory_dlmul_df.fillna(value=0, inplace=True)
homeaudio_dlmul_df.fillna(value=0, inplace=True)
homeaudio_dlmul_df.head(10)
Out[379]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 4573783.133 0.000 0.000 0.000 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.369 0.000 0.000 0.000 2.863 0.000 0.000 0.000 1583 0.000 0.000 0.000 1366 0.000 0.000 0.000 8 0.000 0.000 0.000 33 0.000 0.000 0.000 1 0.000 0.000 0.000 516.000 0.000 0.000 0.000 23 0.000 0.000 0.000 1374.000 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 63 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 4.265 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.265 0.000 0.000 0.000 4.265 0.000 0.000 0.000 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.000 0.000 0.000 0.054 0.000 0.000 0.000 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.000 0.000 0.000 0.633 0.000 0.000 0.000 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.854 0.000 0.000 0.000 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.000 0.000 0.000 0.332 0.000 0.000 0.000 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.000 0.000 0.000 0.137 0.000 0.000 0.000 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.256 0.000 0.000 0.000 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 28.000 0.000 0.000 0.000 12.500 0.000 0.000 0.000 20.100 0.000 0.000 0.000 0.283 0.000 0.000 0.000 2.383 0.000 0.000 0.000 4.417 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.417 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
26 5371525.000 4573783.133 0.000 0.000 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.985 7.369 0.000 0.000 2.746 2.863 0.000 0.000 1868 1583.000 0.000 0.000 1610 1366.000 0.000 0.000 7 8.000 0.000 0.000 50 33.000 0.000 0.000 1 1.000 0.000 0.000 574.000 516.000 0.000 0.000 42 23.000 0.000 0.000 1623.000 1374.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 69 63.000 0.000 0.000 1 0.000 0.000 0.000 0 0.000 0.000 0.000 4.265 4.265 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 0.000 0.000 6.824 4.265 0.000 0.000 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.000 0.000 0.086 0.054 0.000 0.000 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.000 0.000 1.013 0.633 0.000 0.000 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 0.000 0.000 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.000 0.000 0.531 0.332 0.000 0.000 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.000 0.000 0.219 0.137 0.000 0.000 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 0.000 0.000 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 33.000 28.000 0.000 0.000 11.000 12.500 0.000 0.000 23.183 20.100 0.000 0.000 0.000 0.283 0.000 0.000 5.183 2.383 0.000 0.000 1.400 4.417 0.000 0.000 0.000 0.000 0.000 
0.000 1.400 4.417 0.000 0.000 0.000 0.000 0.000 0.000 2 0.000 0.000 0.000
27 4679828.000 5371525.000 4573783.133 0.000 32.357 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.072 6.985 7.369 0.000 2.861 2.746 2.863 0.000 1758 1868.000 1583.000 0.000 1569 1610.000 1366.000 0.000 4 7.000 8.000 0.000 56 50.000 33.000 0.000 0 1.000 1.000 0.000 577.000 574.000 516.000 0.000 36 42.000 23.000 0.000 1430.000 1623.000 1374.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 46 69.000 63.000 0.000 0 1.000 0.000 0.000 0 0.000 0.000 0.000 4.265 4.265 4.265 0.000 4.265 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 0.000 8.359 6.824 4.265 0.000 0.054 0.054 0.054 0.000 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.000 0.106 0.086 0.054 0.000 0.633 0.633 0.633 0.000 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 0.000 1.241 1.013 0.633 0.000 1.854 1.854 1.854 0.000 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 0.000 3.634 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.000 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.000 0.651 0.531 0.332 0.000 0.137 0.137 0.137 0.000 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.000 0.269 0.219 0.137 0.000 1.256 1.256 1.256 0.000 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 0.000 2.462 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 0.000 54.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 0.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 31.500 33.000 28.000 0.000 14.500 11.000 12.500 0.000 23.060 23.183 20.100 0.000 0.000 0.000 0.283 0.000 5.060 5.183 2.383 0.000 1.080 1.400 
4.417 0.000 0.000 0.000 0.000 0.000 1.080 1.400 4.417 0.000 0.000 0.000 0.000 0.000 0 2.000 0.000 0.000
28 3451151.000 4679828.000 5371525.000 4573783.133 32.208 32.357 32.967 31.451 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.201 7.072 6.985 7.369 2.735 2.861 2.746 2.863 1244 1758.000 1868.000 1583.000 1072 1569.000 1610.000 1366.000 2 4.000 7.000 8.000 43 56.000 50.000 33.000 0 0.000 1.000 1.000 420.000 577.000 574.000 516.000 20 36.000 42.000 23.000 1025.000 1430.000 1623.000 1374.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 44 46.000 69.000 63.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 4.265 4.265 4.265 4.265 4.265 4.265 0.000 0.000 0.000 0.000 0.000 0.000 4.265 4.265 4.265 4.265 9.281 8.359 6.824 4.265 0.054 0.054 0.054 0.054 0.054 0.054 0.000 0.000 0.000 0.000 0.000 0.000 0.054 0.054 0.054 0.054 0.118 0.106 0.086 0.054 0.633 0.633 0.633 0.633 0.633 0.633 0.000 0.000 0.000 0.000 0.000 0.000 0.633 0.633 0.633 0.633 1.377 1.241 1.013 0.633 1.854 1.854 1.854 1.854 1.854 1.854 0.000 0.000 0.000 0.000 0.000 0.000 1.854 1.854 1.854 1.854 4.034 3.634 2.966 1.854 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.332 0.332 0.000 0.000 0.000 0.000 0.000 0.000 0.332 0.332 0.332 0.332 0.722 0.651 0.531 0.332 0.137 0.137 0.137 0.137 0.137 0.137 0.000 0.000 0.000 0.000 0.000 0.000 0.137 0.137 0.137 0.137 0.298 0.269 0.219 0.137 1.256 1.256 1.256 1.256 1.256 1.256 0.000 0.000 0.000 0.000 0.000 0.000 1.256 1.256 1.256 1.256 2.733 2.462 2.010 1.256 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 54.600 54.600 54.600 54.600 54.600 54.600 0.000 0.000 0.000 0.000 0.000 0.000 1177.000 1177.000 1177.000 1177.000 1177.000 1177.000 0.000 0.000 0.000 0.000 0.000 0.000 33.500 31.500 33.000 28.000 16.000 14.500 11.000 12.500 24.567 23.060 23.183 20.100 0.000 0.000 0.000 0.283 
6.567 5.060 5.183 2.383 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 4.633 1.080 1.400 4.417 0.000 0.000 0.000 0.000 0 0.000 2.000 0.000
29 2599.000 3451151.000 4679828.000 5371525.000 16.130 32.208 32.357 32.967 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 9.000 7.201 7.072 6.985 2.000 2.735 2.861 2.746 0 1244.000 1758.000 1868.000 0 1072.000 1569.000 1610.000 0 2.000 4.000 7.000 0 43.000 56.000 50.000 0 0.000 0.000 1.000 0.000 420.000 577.000 574.000 0 20.000 36.000 42.000 1.000 1025.000 1430.000 1623.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 44.000 46.000 69.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 1.013 4.265 4.265 4.265 3.181 4.265 4.265 0.000 3.615 0.000 0.000 0.000 3.542 4.265 4.265 4.265 6.581 9.281 8.359 6.824 0.001 0.054 0.054 0.054 0.036 0.054 0.054 0.000 0.043 0.000 0.000 0.000 0.042 0.054 0.054 0.054 0.072 0.118 0.106 0.086 0.256 0.633 0.633 0.633 0.507 0.633 0.633 0.000 0.558 0.000 0.000 0.000 0.549 0.633 0.633 0.633 1.082 1.377 1.241 1.013 0.213 1.854 1.854 1.854 1.307 1.854 1.854 0.000 1.526 0.000 0.000 0.000 1.489 1.854 1.854 1.854 2.634 4.034 3.634 2.966 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.332 0.332 0.332 0.230 0.332 0.332 0.000 0.271 0.000 0.000 0.000 0.264 0.332 0.332 0.332 0.459 0.722 0.651 0.531 0.015 0.137 0.137 0.137 0.096 0.137 0.137 0.000 0.113 0.000 0.000 0.000 0.110 0.137 0.137 0.137 0.194 0.298 0.269 0.219 0.503 1.256 1.256 1.256 1.005 1.256 1.256 0.000 1.105 0.000 0.000 0.000 1.089 1.256 1.256 1.256 2.143 2.733 2.462 2.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 54.600 54.600 54.600 56.395 54.600 54.600 0.000 55.677 0.000 0.000 0.000 1206.000 1177.000 1177.000 1177.000 1186.667 1177.000 1177.000 0.000 1182.800 0.000 0.000 0.000 28.500 33.500 31.500 33.000 15.000 16.000 14.500 11.000 21.650 24.567 23.060 23.183 0.000 0.000 0.000 0.000 3.650 6.567 
5.060 5.183 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0.350 4.633 1.080 1.400 0.000 0.000 0.000 0.000 0 0.000 0.000 2.000
30 3875305.000 2599.000 3451151.000 4679828.000 35.972 16.130 32.208 32.357 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 5.599 9.000 7.201 7.072 2.790 2.000 2.735 2.861 1427 0.000 1244.000 1758.000 1326 0.000 1072.000 1569.000 5 0.000 2.000 4.000 48 0.000 43.000 56.000 1 0.000 0.000 0.000 525.000 0.000 420.000 577.000 36 0.000 20.000 36.000 1108.000 1.000 1025.000 1430.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 66 0.000 44.000 46.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 1.013 1.013 4.265 4.265 1.013 3.181 4.265 4.265 1.013 3.615 0.000 0.000 1.939 3.542 4.265 4.265 3.057 6.581 9.281 8.359 0.001 0.001 0.054 0.054 0.001 0.036 0.054 0.054 0.001 0.043 0.000 0.000 0.016 0.042 0.054 0.054 0.011 0.072 0.118 0.106 0.256 0.256 0.633 0.633 0.256 0.507 0.633 0.633 0.256 0.558 0.000 0.000 0.363 0.549 0.633 0.633 0.697 1.082 1.377 1.241 0.213 0.213 1.854 1.854 0.213 1.307 1.854 1.854 0.213 1.526 0.000 0.000 0.680 1.489 1.854 1.854 0.805 2.634 4.034 3.634 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.026 0.026 0.332 0.332 0.026 0.230 0.332 0.332 0.026 0.271 0.000 0.000 0.113 0.264 0.332 0.332 0.116 0.459 0.722 0.651 0.015 0.015 0.137 0.137 0.015 0.096 0.137 0.137 0.015 0.113 0.000 0.000 0.050 0.110 0.137 0.137 0.058 0.194 0.298 0.269 0.503 0.503 1.256 1.256 0.503 1.005 1.256 1.256 0.503 1.105 0.000 0.000 0.717 1.089 1.256 1.256 1.372 2.143 2.733 2.462 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 59.987 59.987 54.600 54.600 59.987 56.395 54.600 54.600 59.987 55.677 0.000 0.000 1206.000 1206.000 1177.000 1177.000 1206.000 1186.667 1177.000 1177.000 1206.000 1182.800 0.000 0.000 32.000 28.500 33.500 31.500 17.500 15.000 16.000 14.500 24.460 21.650 24.567 23.060 0.000 0.000 0.000 0.000 6.460 
3.650 6.567 5.060 12.120 0.350 4.633 1.080 0.000 0.000 0.000 0.000 12.120 0.350 4.633 1.080 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
31 4190321.000 3875305.000 2599.000 3451151.000 35.737 35.972 16.130 32.208 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 5.577 5.599 9.000 7.201 2.897 2.790 2.000 2.735 1655 1427.000 0.000 1244.000 1508 1326.000 0.000 1072.000 7 5.000 0.000 2.000 53 48.000 0.000 43.000 3 1.000 0.000 0.000 609.000 525.000 0.000 420.000 35 36.000 0.000 20.000 1215.000 1108.000 1.000 1025.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 81 66.000 0.000 44.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 24.064 1.013 1.013 4.265 8.697 1.013 3.181 4.265 5.623 1.013 3.615 0.000 6.855 1.939 3.542 4.265 25.898 3.057 6.581 9.281 0.970 0.001 0.001 0.054 0.324 0.001 0.036 0.054 0.195 0.001 0.043 0.000 0.228 0.016 0.042 0.054 0.977 0.011 0.072 0.118 0.339 0.256 0.256 0.633 0.284 0.256 0.507 0.633 0.273 0.256 0.558 0.000 0.358 0.363 0.549 0.633 0.757 0.697 1.082 1.377 15.697 0.213 0.213 1.854 5.374 0.213 1.307 1.854 3.310 0.213 1.526 0.000 4.017 0.680 1.489 1.854 16.180 0.805 2.634 4.034 0.153 0.000 0.000 0.000 0.051 0.000 0.000 0.000 0.031 0.000 0.000 0.000 0.034 0.000 0.000 0.000 0.153 0.000 0.000 0.000 4.095 0.026 0.026 0.332 1.382 0.026 0.230 0.332 0.840 0.026 0.271 0.000 0.998 0.113 0.264 0.332 4.165 0.116 0.459 0.722 1.260 0.015 0.015 0.137 0.430 0.015 0.096 0.137 0.264 0.015 0.113 0.000 0.319 0.050 0.110 0.137 1.295 0.058 0.194 0.298 1.551 0.503 0.503 1.256 0.852 0.503 1.005 1.256 0.713 0.503 1.105 0.000 0.903 0.717 1.089 1.256 2.374 1.372 2.143 2.733 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 59.987 59.987 54.600 55.633 59.987 56.395 54.600 57.375 59.987 55.677 0.000 1101.000 1206.000 1206.000 1177.000 1171.000 1206.000 1186.667 1177.000 1185.000 1206.000 1182.800 0.000 32.500 32.000 28.500 33.500 9.000 17.500 15.000 16.000 19.240 24.460 21.650 24.567 1.280 0.000 0.000 
0.000 2.520 6.460 3.650 6.567 0.960 12.120 0.350 4.633 0.000 0.000 0.000 0.000 0.960 12.120 0.350 4.633 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
32 3740780.000 4190321.000 3875305.000 2599.000 35.231 35.737 35.972 16.130 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.246 5.577 5.599 9.000 2.696 2.897 2.790 2.000 1459 1655.000 1427.000 0.000 1336 1508.000 1326.000 0.000 7 7.000 5.000 0.000 52 53.000 48.000 0.000 4 3.000 1.000 0.000 532.000 609.000 525.000 0.000 32 35.000 36.000 0.000 1110.000 1215.000 1108.000 1.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 45 81.000 66.000 0.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 24.064 24.064 1.013 1.013 16.380 8.697 1.013 3.181 10.233 5.623 1.013 3.615 10.680 6.855 1.939 3.542 39.603 25.898 3.057 6.581 0.970 0.970 0.001 0.001 0.647 0.324 0.001 0.036 0.389 0.195 0.001 0.043 0.393 0.228 0.016 0.042 1.556 0.977 0.011 0.072 0.339 0.339 0.256 0.256 0.311 0.284 0.256 0.507 0.289 0.273 0.256 0.558 0.354 0.358 0.363 0.549 0.793 0.757 0.697 1.082 15.697 15.697 0.213 0.213 10.536 5.374 0.213 1.307 6.407 3.310 0.213 1.526 6.613 4.017 0.680 1.489 25.405 16.180 0.805 2.634 0.153 0.153 0.000 0.000 0.102 0.051 0.000 0.000 0.061 0.031 0.000 0.000 0.060 0.034 0.000 0.000 0.245 0.153 0.000 0.000 4.095 4.095 0.026 0.026 2.739 1.382 0.026 0.230 1.654 0.840 0.026 0.271 1.686 0.998 0.113 0.264 6.594 4.165 0.116 0.459 1.260 1.260 0.015 0.015 0.845 0.430 0.015 0.096 0.513 0.264 0.015 0.113 0.528 0.319 0.050 0.110 2.037 1.295 0.058 0.194 1.551 1.551 0.503 0.503 1.202 0.852 0.503 1.005 0.922 0.713 0.503 1.105 1.047 0.903 0.717 1.089 2.976 2.374 1.372 2.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 59.987 59.987 51.279 55.633 59.987 56.395 54.762 57.375 59.987 55.677 1101.000 1101.000 1206.000 1206.000 1136.000 1171.000 1206.000 1186.667 1164.000 1185.000 1206.000 1182.800 27.500 32.500 32.000 28.500 13.000 9.000 17.500 15.000 20.550 19.240 24.460 21.650 0.000 
1.280 0.000 0.000 2.550 2.520 6.460 3.650 1.100 0.960 12.120 0.350 0.000 0.000 0.000 0.000 1.100 0.960 12.120 0.350 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
33 4212446.000 3740780.000 4190321.000 3875305.000 34.340 35.231 35.737 35.972 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.332 6.246 5.577 5.599 2.673 2.696 2.897 2.790 1745 1459.000 1655.000 1427.000 1686 1336.000 1508.000 1326.000 6 7.000 7.000 5.000 52 52.000 53.000 48.000 3 4.000 3.000 1.000 716.000 532.000 609.000 525.000 30 32.000 35.000 36.000 1276.000 1110.000 1215.000 1108.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 69 45.000 81.000 66.000 0 1.000 0.000 1.000 0 0.000 0.000 0.000 24.064 24.064 24.064 1.013 24.064 16.380 8.697 1.013 14.844 10.233 5.623 1.013 13.654 10.680 6.855 1.939 47.826 39.603 25.898 3.057 0.970 0.970 0.970 0.001 0.970 0.647 0.324 0.001 0.582 0.389 0.195 0.001 0.521 0.393 0.228 0.016 1.904 1.556 0.977 0.011 0.339 0.339 0.339 0.256 0.339 0.311 0.284 0.256 0.306 0.289 0.273 0.256 0.350 0.354 0.358 0.363 0.815 0.793 0.757 0.697 15.697 15.697 15.697 0.213 15.697 10.536 5.374 0.213 9.503 6.407 3.310 0.213 8.631 6.613 4.017 0.680 30.940 25.405 16.180 0.805 0.153 0.153 0.153 0.000 0.153 0.102 0.051 0.000 0.092 0.061 0.031 0.000 0.081 0.060 0.034 0.000 0.300 0.245 0.153 0.000 4.095 4.095 4.095 0.026 4.095 2.739 1.382 0.026 2.467 1.654 0.840 0.026 2.221 1.686 0.998 0.113 8.051 6.594 4.165 0.116 1.260 1.260 1.260 0.015 1.260 0.845 0.430 0.015 0.762 0.513 0.264 0.015 0.691 0.528 0.319 0.050 2.482 2.037 1.295 0.058 1.551 1.551 1.551 0.503 1.551 1.202 0.852 0.503 1.132 0.922 0.713 0.503 1.159 1.047 0.903 0.717 3.336 2.976 2.374 1.372 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 46.925 59.987 46.925 51.279 55.633 59.987 52.150 54.762 57.375 59.987 1101.000 1101.000 1101.000 1206.000 1101.000 1136.000 1171.000 1206.000 1143.000 1164.000 1185.000 1206.000 25.500 27.500 32.500 32.000 14.500 13.000 9.000 17.500 20.000 20.550 
19.240 24.460 0.000 0.000 1.280 0.000 2.000 2.550 2.520 6.460 0.000 1.100 0.960 12.120 0.000 0.000 0.000 0.000 0.000 1.100 0.960 12.120 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
34 4149262.000 4212446.000 3740780.000 4190321.000 33.715 34.340 35.231 35.737 0.004 0.000 0.000 0.000 0.005 0.000 0.000 0.000 6.394 6.332 6.246 5.577 2.515 2.673 2.696 2.897 1653 1745.000 1459.000 1655.000 1519 1686.000 1336.000 1508.000 4 6.000 7.000 7.000 52 52.000 52.000 53.000 1 3.000 4.000 3.000 644.000 716.000 532.000 609.000 30 30.000 32.000 35.000 1262.000 1276.000 1110.000 1215.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 0 0.000 0.000 0.000 58 69.000 45.000 81.000 1 0.000 1.000 0.000 0 0.000 0.000 0.000 24.064 24.064 24.064 24.064 24.064 24.064 16.380 8.697 19.454 14.844 10.233 5.623 15.967 13.654 10.680 6.855 52.759 47.826 39.603 25.898 0.970 0.970 0.970 0.970 0.970 0.970 0.647 0.324 0.776 0.582 0.389 0.195 0.621 0.521 0.393 0.228 2.112 1.904 1.556 0.977 0.339 0.339 0.339 0.339 0.339 0.339 0.311 0.284 0.322 0.306 0.289 0.273 0.348 0.350 0.354 0.358 0.828 0.815 0.793 0.757 15.697 15.697 15.697 15.697 15.697 15.697 10.536 5.374 12.600 9.503 6.407 3.310 10.202 8.631 6.613 4.017 34.261 30.940 25.405 16.180 0.153 0.153 0.153 0.153 0.153 0.153 0.102 0.051 0.122 0.092 0.061 0.031 0.097 0.081 0.060 0.034 0.333 0.300 0.245 0.153 4.095 4.095 4.095 4.095 4.095 4.095 2.739 1.382 3.281 2.467 1.654 0.840 2.638 2.221 1.686 0.998 8.926 8.051 6.594 4.165 1.260 1.260 1.260 1.260 1.260 1.260 0.845 0.430 1.011 0.762 0.513 0.264 0.817 0.691 0.528 0.319 2.749 2.482 2.037 1.295 1.551 1.551 1.551 1.551 1.551 1.551 1.202 0.852 1.341 1.132 0.922 0.713 1.246 1.159 1.047 0.903 3.553 3.336 2.976 2.374 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 46.925 46.925 46.925 46.925 46.925 46.925 51.279 55.633 49.538 52.150 54.762 57.375 1101.000 1101.000 1101.000 1101.000 1101.000 1101.000 1136.000 1171.000 1122.000 1143.000 1164.000 1185.000 26.500 25.500 27.500 32.500 8.000 14.500 13.000 9.000 
17.725 20.000 20.550 19.240 1.925 0.000 0.000 1.280 1.650 2.000 2.550 2.520 2.450 0.000 1.100 0.960 0.000 0.000 0.000 0.000 2.450 0.000 1.100 0.960 0.000 0.000 0.000 0.000 0 0.000 0.000 0.000
In [380]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_dladd_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_dladd_df.isnull().sum()/homeaudio_dladd_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[380]:
Total Percentage
Sale_lag3 0 0.000
TV_Ad_Stock_lag1 0 0.000
TV_SMA_5_lag1 0 0.000
TV_SMA_5_lag2 0 0.000
TV_SMA_5_lag3 0 0.000
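
The null check above is repeated for several dataframes in this notebook. As a minimal sketch, it could be wrapped in a small helper. The function name `null_summary` and the `demo` dataframe are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd


def null_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column null count and percentage, columns with most nulls first.

    Hypothetical helper: it mirrors the Total/Percentage table built above.
    """
    total = df.isnull().sum()
    percentage = (100 * total / len(df)).round(2)
    summary = pd.DataFrame({"Total": total, "Percentage": percentage})
    return summary.sort_values("Total", ascending=False)


# Toy data (assumed for illustration): column "a" has one null out of four rows.
demo = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0], "b": [1.0, 2.0, 3.0, 4.0]})
result = null_summary(demo)
print(result)
```

Building both columns from the same `total` series keeps the two columns aligned on the same index before sorting.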

Taking the logarithm of both the dependent and the independent variables

After taking the log, every 0 value becomes -inf and every negative value becomes NaN. These values are reset to 0.
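
To make the effect of this clean-up concrete, here is a small sketch on toy data. The column names and values in `demo` are assumptions chosen for illustration (4.265 is taken from the output above, where it maps to 1.450). The `log1p` line is shown only as a commonly used alternative. It is not what this notebook does.

```python
import numpy as np
import pandas as pd

# Toy data (assumed): a zero, a one, a positive value, and one negative value.
demo = pd.DataFrame({"spend": [0.0, 1.0, 4.265], "temp": [-1.5, 2.0, 10.0]})

# Silence the divide-by-zero / invalid-value warnings raised by log(0) and log(<0).
with np.errstate(divide="ignore", invalid="ignore"):
    logged = np.log(demo)

# Same clean-up as the notebook: +/-inf and NaN are all reset to 0.
cleaned = logged.replace([np.inf, -np.inf], 0).fillna(0)
print(cleaned)

# Alternative (not used in this notebook): log(1 + x) is defined at x = 0.
alt = np.log1p(demo["spend"])
```

Note that after the clean-up an original value of 0 and an original value of 1 both end up as 0, because log(1) is also 0. This is worth keeping in mind when interpreting the coefficients of the multiplicative model.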
In [381]:
# Natural log of every cell. log(0) gives -inf and log of a negative value gives NaN,
# so infinities and NaNs are both reset to 0 afterwards.
cameraaccessory_dlmul_df = cameraaccessory_dlmul_df.applymap(np.log)
cameraaccessory_dlmul_df = cameraaccessory_dlmul_df.replace([np.inf, -np.inf], 0)
cameraaccessory_dlmul_df = cameraaccessory_dlmul_df.replace(np.nan, 0)

gamingaccessory_dlmul_df = gamingaccessory_dlmul_df.applymap(np.log)
gamingaccessory_dlmul_df = gamingaccessory_dlmul_df.replace([np.inf, -np.inf], 0)
gamingaccessory_dlmul_df = gamingaccessory_dlmul_df.replace(np.nan, 0)

homeaudio_dlmul_df = homeaudio_dlmul_df.applymap(np.log)
homeaudio_dlmul_df = homeaudio_dlmul_df.replace([np.inf, -np.inf], 0)
homeaudio_dlmul_df = homeaudio_dlmul_df.replace(np.nan, 0)


homeaudio_dlmul_df.head()
Out[381]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 15.336 0.000 0.000 0.000 3.448 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.997 0.000 0.000 0.000 1.052 0.000 0.000 0.000 7.367 0.000 0.000 0.000 7.220 0.000 0.000 0.000 2.079 0.000 0.000 0.000 3.497 0.000 0.000 0.000 0.000 0.000 0.000 0.000 6.246 0.000 0.000 0.000 3.135 0.000 0.000 0.000 7.225 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 0.000 0.000 0.000 1.450 0.000 0.000 0.000 -2.919 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -2.919 0.000 0.000 0.000 -2.919 0.000 0.000 0.000 -0.457 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -0.457 0.000 0.000 0.000 -0.457 0.000 0.000 0.000 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.617 0.000 0.000 0.000 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 0.000 0.000 0.000 -1.103 0.000 0.000 0.000 -1.988 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.988 0.000 0.000 0.000 -1.988 0.000 0.000 0.000 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.228 0.000 0.000 0.000 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.071 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.332 0.000 0.000 0.000 2.526 0.000 0.000 0.000 3.001 0.000 0.000 0.000 -1.261 0.000 0.000 0.000 0.869 0.000 0.000 0.000 1.485 0.000 0.000 0.000 0.000 0.000 
0.000 0.000 1.485 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
26 15.497 15.336 0.000 0.000 3.495 3.448 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.944 1.997 0.000 0.000 1.010 1.052 0.000 0.000 7.533 7.367 0.000 0.000 7.384 7.220 0.000 0.000 1.946 2.079 0.000 0.000 3.912 3.497 0.000 0.000 0.000 0.000 0.000 0.000 6.353 6.246 0.000 0.000 3.738 3.135 0.000 0.000 7.392 7.225 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.234 4.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 1.450 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 1.450 0.000 0.000 1.920 1.450 0.000 0.000 -2.919 -2.919 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -2.919 -2.919 0.000 0.000 -2.449 -2.919 0.000 0.000 -0.457 -0.457 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -0.457 -0.457 0.000 0.000 0.013 -0.457 0.000 0.000 0.617 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.617 0.617 0.000 0.000 1.087 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 0.000 0.000 -0.633 -1.103 0.000 0.000 -1.988 -1.988 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.988 -1.988 0.000 0.000 -1.518 -1.988 0.000 0.000 0.228 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.228 0.228 0.000 0.000 0.698 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 4.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.071 7.071 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.497 3.332 0.000 0.000 2.398 2.526 0.000 0.000 3.143 3.001 0.000 0.000 0.000 -1.261 0.000 0.000 1.645 0.869 0.000 0.000 0.336 1.485 0.000 0.000 
0.000 0.000 0.000 0.000 0.336 1.485 0.000 0.000 0.000 0.000 0.000 0.000 0.693 0.000 0.000 0.000
27 15.359 15.497 15.336 0.000 3.477 3.495 3.448 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.956 1.944 1.997 0.000 1.051 1.010 1.052 0.000 7.472 7.533 7.367 0.000 7.358 7.384 7.220 0.000 1.386 1.946 2.079 0.000 4.025 3.912 3.497 0.000 0.000 0.000 0.000 0.000 6.358 6.353 6.246 0.000 3.584 3.738 3.135 0.000 7.265 7.392 7.225 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.829 4.234 4.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 1.450 1.450 0.000 1.450 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 1.450 1.450 0.000 2.123 1.920 1.450 0.000 -2.919 -2.919 -2.919 0.000 -2.919 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -2.919 -2.919 -2.919 0.000 -2.246 -2.449 -2.919 0.000 -0.457 -0.457 -0.457 0.000 -0.457 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -0.457 -0.457 -0.457 0.000 0.216 0.013 -0.457 0.000 0.617 0.617 0.617 0.000 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.617 0.617 0.617 0.000 1.290 1.087 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 -1.103 0.000 -1.103 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 -1.103 0.000 -0.430 -0.633 -1.103 0.000 -1.988 -1.988 -1.988 0.000 -1.988 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.988 -1.988 -1.988 0.000 -1.315 -1.518 -1.988 0.000 0.228 0.228 0.228 0.000 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.228 0.228 0.228 0.000 0.901 0.698 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 4.000 4.000 0.000 4.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 7.071 7.071 7.071 0.000 7.071 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.450 3.497 3.332 0.000 2.674 2.398 2.526 0.000 3.138 3.143 3.001 0.000 0.000 0.000 -1.261 0.000 1.621 1.645 0.869 0.000 0.077 
0.336 1.485 0.000 0.000 0.000 0.000 0.000 0.077 0.336 1.485 0.000 0.000 0.000 0.000 0.000 0.000 0.693 0.000 0.000
28 15.054 15.359 15.497 15.336 3.472 3.477 3.495 3.448 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.974 1.956 1.944 1.997 1.006 1.051 1.010 1.052 7.126 7.472 7.533 7.367 6.977 7.358 7.384 7.220 0.693 1.386 1.946 2.079 3.761 4.025 3.912 3.497 0.000 0.000 0.000 0.000 6.040 6.358 6.353 6.246 2.996 3.584 3.738 3.135 6.932 7.265 7.392 7.225 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.784 3.829 4.234 4.143 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.450 1.450 1.450 1.450 1.450 1.450 0.000 0.000 0.000 0.000 0.000 0.000 1.450 1.450 1.450 1.450 2.228 2.123 1.920 1.450 -2.919 -2.919 -2.919 -2.919 -2.919 -2.919 0.000 0.000 0.000 0.000 0.000 0.000 -2.919 -2.919 -2.919 -2.919 -2.141 -2.246 -2.449 -2.919 -0.457 -0.457 -0.457 -0.457 -0.457 -0.457 0.000 0.000 0.000 0.000 0.000 0.000 -0.457 -0.457 -0.457 -0.457 0.320 0.216 0.013 -0.457 0.617 0.617 0.617 0.617 0.617 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.617 0.617 0.617 0.617 1.395 1.290 1.087 0.617 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 -1.103 -1.103 -1.103 -1.103 0.000 0.000 0.000 0.000 0.000 0.000 -1.103 -1.103 -1.103 -1.103 -0.325 -0.430 -0.633 -1.103 -1.988 -1.988 -1.988 -1.988 -1.988 -1.988 0.000 0.000 0.000 0.000 0.000 0.000 -1.988 -1.988 -1.988 -1.988 -1.210 -1.315 -1.518 -1.988 0.228 0.228 0.228 0.228 0.228 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.228 0.228 0.228 0.228 1.005 0.901 0.698 0.228 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.000 4.000 4.000 4.000 4.000 4.000 0.000 0.000 0.000 0.000 0.000 0.000 7.071 7.071 7.071 7.071 7.071 7.071 0.000 0.000 0.000 0.000 0.000 0.000 3.512 3.450 3.497 3.332 2.773 2.674 2.398 2.526 3.201 3.138 3.143 3.001 0.000 0.000 0.000 -1.261 1.882 1.621 1.645 
0.869 1.533 0.077 0.336 1.485 0.000 0.000 0.000 0.000 1.533 0.077 0.336 1.485 0.000 0.000 0.000 0.000 0.000 0.000 0.693 0.000
29 7.863 15.054 15.359 15.497 2.781 3.472 3.477 3.495 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 2.197 1.974 1.956 1.944 0.693 1.006 1.051 1.010 0.000 7.126 7.472 7.533 0.000 6.977 7.358 7.384 0.000 0.693 1.386 1.946 0.000 3.761 4.025 3.912 0.000 0.000 0.000 0.000 0.000 6.040 6.358 6.353 0.000 2.996 3.584 3.738 0.000 6.932 7.265 7.392 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.784 3.829 4.234 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.013 1.450 1.450 1.450 1.157 1.450 1.450 0.000 1.285 0.000 0.000 0.000 1.265 1.450 1.450 1.450 1.884 2.228 2.123 1.920 -6.908 -2.919 -2.919 -2.919 -3.315 -2.919 -2.919 0.000 -3.137 0.000 0.000 0.000 -3.165 -2.919 -2.919 -2.919 -2.638 -2.141 -2.246 -2.449 -1.363 -0.457 -0.457 -0.457 -0.679 -0.457 -0.457 0.000 -0.584 0.000 0.000 0.000 -0.599 -0.457 -0.457 -0.457 0.079 0.320 0.216 0.013 -1.546 0.617 0.617 0.617 0.268 0.617 0.617 0.000 0.423 0.000 0.000 0.000 0.398 0.617 0.617 0.617 0.968 1.395 1.290 1.087 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -3.650 -1.103 -1.103 -1.103 -1.470 -1.103 -1.103 0.000 -1.306 0.000 0.000 0.000 -1.332 -1.103 -1.103 -1.103 -0.778 -0.325 -0.430 -0.633 -4.200 -1.988 -1.988 -1.988 -2.340 -1.988 -1.988 0.000 -2.184 0.000 0.000 0.000 -2.208 -1.988 -1.988 -1.988 -1.641 -1.210 -1.315 -1.518 -0.687 0.228 0.228 0.228 0.005 0.228 0.228 0.000 0.100 0.000 0.000 0.000 0.085 0.228 0.228 0.228 0.762 1.005 0.901 0.698 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 4.094 4.000 4.000 4.000 4.032 4.000 4.000 0.000 4.020 0.000 0.000 0.000 7.095 7.071 7.071 7.071 7.079 7.071 7.071 0.000 7.076 0.000 0.000 0.000 3.350 3.512 3.450 3.497 2.708 2.773 2.674 2.398 3.075 3.201 3.138 3.143 0.000 0.000 0.000 0.000 1.295 1.882 
1.621 1.645 -1.050 1.533 0.077 0.336 0.000 0.000 0.000 0.000 -1.050 1.533 0.077 0.336 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.693
In [382]:
# Checking for total count and percentage of null values in all columns of the dataframe.

total = pd.DataFrame(homeaudio_dlmul_df.isnull().sum().sort_values(ascending=False), columns=['Total'])
percentage = pd.DataFrame(round(100*(homeaudio_dlmul_df.isnull().sum()/homeaudio_dlmul_df.shape[0]),2).sort_values(ascending=False)\
                          ,columns=['Percentage'])

pd.concat([total, percentage], axis = 1).head()
Out[382]:
                  Total  Percentage
Sale_lag3             0       0.000
TV_Ad_Stock_lag1      0       0.000
TV_SMA_5_lag1         0       0.000
TV_SMA_5_lag2         0       0.000
TV_SMA_5_lag3         0       0.000
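
The null check above builds the count and the percentage as two separate frames and concatenates them. A more compact way to get the same summary is sketched below on a small made-up frame; `toy_df` and its values are hypothetical and are not taken from the ElecKart data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value (illustrative only)
toy_df = pd.DataFrame({
    "gmv": [1.0, np.nan, 3.0, 4.0],
    "TV": [1.0, 2.0, 3.0, 4.0],
})

# isnull().sum() counts missing values per column;
# isnull().mean() is the fraction missing, so 100 * mean is the percentage
null_summary = pd.DataFrame({
    "Total": toy_df.isnull().sum(),
    "Percentage": (100 * toy_df.isnull().mean()).round(2),
}).sort_values("Total", ascending=False)

print(null_summary)
```

Because the summary is sorted in descending order, a first row with a total of 0 means that no column in the frame has missing values.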

Rescaling the Features of the 3 Dataframes

We will use standard scaling, which rescales every column to zero mean and unit variance, via scikit-learn's StandardScaler.
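
As a minimal sketch of what standard scaling computes, the snippet below compares StandardScaler with a manual z-score on a small made-up frame. `toy_df` and its values are hypothetical and are not part of the notebook's data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data (illustrative values only)
toy_df = pd.DataFrame({
    "gmv": [10.0, 20.0, 30.0, 40.0],
    "TV": [1.0, 1.0, 3.0, 3.0],
})

# Scale every column with StandardScaler, keeping column names and index
scaler = StandardScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(toy_df), columns=toy_df.columns, index=toy_df.index
)

# Manual z-score: subtract the column mean and divide by the population
# standard deviation (ddof=0), which is the convention StandardScaler uses
manual = (toy_df - toy_df.mean()) / toy_df.std(ddof=0)

print(np.allclose(scaled.values, manual.values))
```

Columns with zero variance come out as all zeros, because StandardScaler leaves the divisor at 1 for constant features. This is why some columns in the scaled output read 0.000 in every row.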

In [383]:
from sklearn.preprocessing import StandardScaler

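scaler = StandardScaler()

# Each fit_transform call refits the scaler on that dataframe, so the three
# sub-categories are scaled independently of one another.
# Note: all columns, including the target gmv, are scaled, and the scaler is
# fit on the full dataframe before the train/test split.
cameraaccessory_dlmul_df[cameraaccessory_dlmul_df.columns]=scaler.fit_transform(cameraaccessory_dlmul_df[cameraaccessory_dlmul_df.columns])
gamingaccessory_dlmul_df[gamingaccessory_dlmul_df.columns]=scaler.fit_transform(gamingaccessory_dlmul_df[gamingaccessory_dlmul_df.columns])
homeaudio_dlmul_df[homeaudio_dlmul_df.columns]=scaler.fit_transform(homeaudio_dlmul_df[homeaudio_dlmul_df.columns])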

homeaudio_dlmul_df.head()
Out[383]:
gmv gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
25 0.137 -6.090 -4.499 -3.705 -0.777 -6.578 -4.695 -3.819 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 1.799 -6.017 -4.497 -3.707 1.220 -6.257 -4.570 -3.746 0.200 -4.123 -3.478 -3.048 0.103 -4.663 -3.803 -3.271 1.325 -1.647 -1.647 -1.572 0.174 -3.077 -2.801 -2.557 -1.879 -1.775 -1.682 -1.596 0.198 -4.648 -3.798 -3.267 -0.104 -4.020 -3.473 -3.099 0.094 -4.633 -3.784 -3.259 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 0.047 -4.455 -3.683 -3.192 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.455 -2.916 -2.640 -2.422 -3.028 -2.727 -2.491 -2.298 -2.631 -2.413 -2.233 -2.080 -2.055 -3.711 -3.212 -2.857 -2.806 -3.856 -3.309 -2.929 -1.388 0.401 0.389 0.378 0.352 0.338 0.324 0.308 0.279 0.262 0.245 0.227 -2.223 0.419 0.405 0.393 -2.497 -0.275 -0.274 -0.275 0.603 1.120 1.081 1.043 1.095 1.055 1.016 0.977 1.052 1.011 0.971 0.931 0.404 1.130 1.084 1.040 -0.799 -0.132 -0.140 -0.151 -0.892 -1.397 -1.337 -1.282 -1.722 -1.634 -1.555 -1.482 -1.718 -1.630 -1.551 -1.478 -1.778 -2.479 -2.286 -2.125 -2.291 -2.697 -2.461 -2.271 1.425 1.373 1.326 1.283 1.520 1.479 1.444 1.379 1.735 1.654 1.572 1.496 1.889 1.795 1.705 1.620 1.143 1.111 1.073 1.031 -1.727 -0.816 -0.800 -0.784 -1.088 -1.063 -1.038 -1.006 -1.215 -1.177 -1.136 -1.095 -2.260 -1.071 -1.034 -0.997 -2.713 -1.652 -1.575 -1.503 -1.752 0.056 0.049 0.042 -0.043 -0.052 -0.060 -0.063 -0.133 -0.138 -0.139 -0.137 -2.218 0.066 0.066 0.069 -2.757 -0.799 -0.779 -0.758 -0.188 -0.558 -0.561 -0.564 -0.651 -0.654 -0.658 -0.654 -0.678 -0.676 -0.671 -0.662 -0.503 -0.975 -0.966 -0.955 -1.881 -2.145 -2.025 -1.916 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 1.560 -6.803 -4.777 -3.863 -4.780 -3.864 -3.311 -2.929 -3.311 -2.929 -2.643 -2.418 0.186 -6.839 -4.790 -3.870 -4.791 -3.871 -3.315 -2.931 
-3.316 -2.932 -2.645 -2.420 0.679 -3.491 -3.069 -2.757 1.216 -0.992 -0.952 -0.913 0.771 -1.572 -1.497 -1.428 -2.166 -1.163 -1.163 -1.204 0.608 -0.418 -0.385 -0.355 0.841 -0.584 -0.597 -0.595 -0.226 -0.226 -0.226 -0.226 0.751 -0.641 -0.655 -0.652 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 -0.443
26 0.279 0.185 -4.499 -3.705 -0.465 -0.086 -4.695 -3.819 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 1.428 1.019 -4.497 -3.707 0.555 0.628 -4.570 -3.746 0.323 0.240 -3.478 -3.048 0.251 0.161 -3.803 -3.271 1.130 1.313 -1.647 -1.572 0.585 0.186 -2.801 -2.557 -1.879 -1.775 -1.682 -1.596 0.310 0.228 -3.798 -3.267 0.794 -0.041 -3.473 -3.099 0.242 0.156 -3.784 -3.259 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 0.177 0.112 -3.683 -3.192 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.455 -1.262 -2.640 -2.422 -3.028 -2.727 -2.491 -2.298 -2.631 -2.413 -2.233 -2.080 -2.055 -1.654 -3.212 -2.857 -2.175 -2.245 -3.309 -2.929 -1.388 -1.397 0.389 0.378 0.352 0.338 0.324 0.308 0.279 0.262 0.245 0.227 -2.223 -2.234 0.405 0.393 -2.139 -2.496 -0.274 -0.275 0.603 0.569 1.081 1.043 1.095 1.055 1.016 0.977 1.052 1.011 0.971 0.931 0.404 0.359 1.084 1.040 -0.107 -0.807 -0.140 -0.151 -0.892 -0.839 -1.337 -1.282 -1.722 -1.634 -1.555 -1.482 -1.718 -1.630 -1.551 -1.478 -1.778 -1.603 -2.286 -2.125 -1.752 -2.046 -2.461 -2.271 1.425 1.373 1.326 1.283 1.520 1.479 1.444 1.379 1.735 1.654 1.572 1.496 1.889 1.795 1.705 1.620 1.143 1.111 1.073 1.031 -1.727 -1.704 -0.800 -0.784 -1.088 -1.063 -1.038 -1.006 -1.215 -1.177 -1.136 -1.095 -2.260 -2.209 -1.034 -0.997 -2.296 -2.602 -1.575 -1.503 -1.752 -1.760 0.049 0.042 -0.043 -0.052 -0.060 -0.063 -0.133 -0.138 -0.139 -0.137 -2.218 -2.219 0.066 0.069 -2.298 -2.725 -0.779 -0.758 -0.188 -0.191 -0.561 -0.564 -0.651 -0.654 -0.658 -0.654 -0.678 -0.676 -0.671 -0.662 -0.503 -0.499 -0.966 -0.955 -1.059 -1.764 -2.025 -1.916 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 1.560 0.340 -4.777 -3.863 -4.780 -3.864 -3.311 -2.929 -3.311 -2.929 -2.643 -2.418 0.186 0.159 -4.790 -3.870 -4.791 -3.871 -3.315 -2.931 -3.316 -2.932 
-2.645 -2.420 0.916 0.683 -3.069 -2.757 1.102 1.271 -0.952 -0.913 0.887 0.811 -1.497 -1.428 -1.163 -2.166 -1.163 -1.204 1.556 0.678 -0.385 -0.355 -0.261 0.841 -0.597 -0.595 -0.226 -0.226 -0.226 -0.226 -0.326 0.751 -0.655 -0.652 -0.309 -0.309 -0.309 -0.309 0.999 -0.443 -0.443 -0.443
27 0.157 0.251 0.236 -3.705 -0.588 0.002 0.036 -3.819 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 1.514 0.831 0.857 -3.707 1.205 0.356 0.564 -3.746 0.278 0.338 0.281 -3.048 0.228 0.271 0.212 -3.271 0.314 1.123 1.313 -1.572 0.698 0.574 0.211 -2.557 -1.879 -1.775 -1.682 -1.596 0.315 0.312 0.266 -3.267 0.564 0.723 0.015 -3.099 0.129 0.266 0.209 -3.259 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 -0.402 0.212 0.170 -3.192 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.455 -1.262 -1.113 -2.422 -1.326 -2.727 -2.491 -2.298 -2.631 -2.413 -2.233 -2.080 -2.055 -1.654 -1.394 -2.857 -1.903 -1.723 -1.898 -2.929 -1.388 -1.397 -1.407 0.378 -1.892 0.338 0.324 0.308 0.279 0.262 0.245 0.227 -2.223 -2.234 -2.244 0.393 -1.984 -2.138 -2.496 -0.275 0.603 0.569 0.537 1.043 0.502 1.055 1.016 0.977 1.052 1.011 0.971 0.931 0.404 0.359 0.314 1.040 0.192 -0.113 -0.816 -0.151 -0.892 -0.839 -0.789 -1.282 -1.079 -1.634 -1.555 -1.482 -1.718 -1.630 -1.551 -1.478 -1.778 -1.603 -1.460 -2.125 -1.520 -1.551 -1.854 -2.271 1.425 1.373 1.326 1.283 1.520 1.479 1.444 1.379 1.735 1.654 1.572 1.496 1.889 1.795 1.705 1.620 1.143 1.111 1.073 1.031 -1.727 -1.704 -1.682 -0.784 -2.169 -1.063 -1.038 -1.006 -1.215 -1.177 -1.136 -1.095 -2.260 -2.209 -2.159 -0.997 -2.116 -2.197 -2.500 -1.503 -1.752 -1.760 -1.769 0.042 -2.280 -0.052 -0.060 -0.063 -0.133 -0.138 -0.139 -0.137 -2.218 -2.219 -2.219 0.069 -2.100 -2.270 -2.693 -0.758 -0.188 -0.191 -0.193 -0.564 -0.262 -0.654 -0.658 -0.654 -0.678 -0.676 -0.671 -0.662 -0.503 -0.499 -0.493 -0.955 -0.704 -0.980 -1.660 -1.916 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 1.560 0.340 0.349 -3.863 0.350 -3.864 -3.311 -2.929 -3.311 -2.929 -2.643 -2.418 0.186 0.159 0.218 -3.870 0.218 -3.871 -3.315 -2.931 -3.316 -2.932 -2.645 
-2.420 0.849 0.889 0.700 -2.757 1.348 1.157 1.331 -0.913 0.882 0.924 0.851 -1.428 -1.163 -1.163 -2.166 -1.204 1.526 1.659 0.753 -0.355 -0.510 -0.261 0.838 -0.595 -0.226 -0.226 -0.226 -0.226 -0.569 -0.326 0.748 -0.652 -0.309 -0.309 -0.309 -0.309 -0.443 0.999 -0.443 -0.443
28 -0.112 0.195 0.286 0.281 -0.619 -0.033 0.100 0.110 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 1.639 0.874 0.714 0.798 0.488 0.622 0.360 0.557 0.022 0.302 0.366 0.320 -0.116 0.254 0.303 0.258 -0.698 0.327 1.123 1.310 0.436 0.679 0.569 0.242 -1.879 -1.775 -1.682 -1.596 -0.017 0.316 0.335 0.307 -0.313 0.527 0.684 0.057 -0.168 0.182 0.301 0.254 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 -0.466 -0.235 0.255 0.230 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.455 -1.262 -1.113 -0.993 -1.326 -1.164 -2.491 -2.298 -2.631 -2.413 -2.233 -2.080 -2.055 -1.654 -1.394 -1.205 -1.763 -1.497 -1.440 -1.653 -1.388 -1.397 -1.407 -1.416 -1.892 -1.906 0.324 0.308 0.279 0.262 0.245 0.227 -2.223 -2.234 -2.244 -2.253 -1.905 -1.984 -2.138 -2.496 0.603 0.569 0.537 0.505 0.502 0.468 1.016 0.977 1.052 1.011 0.971 0.931 0.404 0.359 0.314 0.268 0.346 0.186 -0.121 -0.830 -0.892 -0.839 -0.789 -0.743 -1.079 -1.009 -1.555 -1.482 -1.718 -1.630 -1.551 -1.478 -1.778 -1.603 -1.460 -1.340 -1.400 -1.337 -1.392 -1.697 1.425 1.373 1.326 1.283 1.520 1.479 1.444 1.379 1.735 1.654 1.572 1.496 1.889 1.795 1.705 1.620 1.143 1.111 1.073 1.031 -1.727 -1.704 -1.682 -1.661 -2.169 -2.133 -1.038 -1.006 -1.215 -1.177 -1.136 -1.095 -2.260 -2.209 -2.159 -2.111 -2.023 -2.022 -2.106 -2.406 -1.752 -1.760 -1.769 -1.777 -2.280 -2.294 -0.060 -0.063 -0.133 -0.138 -0.139 -0.137 -2.218 -2.219 -2.219 -2.217 -1.998 -2.073 -2.240 -2.659 -0.188 -0.191 -0.193 -0.196 -0.262 -0.265 -0.658 -0.654 -0.678 -0.676 -0.671 -0.662 -0.503 -0.499 -0.493 -0.484 -0.522 -0.641 -0.908 -1.565 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 1.560 0.340 0.349 0.375 0.350 0.376 -3.311 -2.929 -3.311 -2.929 -2.643 -2.418 0.186 0.159 0.218 0.267 0.218 0.266 -3.315 -2.931 -3.316 -2.932 -2.645 -2.420 
0.937 0.830 0.886 0.722 1.435 1.404 1.215 1.373 0.934 0.920 0.962 0.888 -1.163 -1.163 -1.163 -2.228 1.844 1.628 1.770 0.803 0.887 -0.510 -0.272 0.839 -0.226 -0.226 -0.226 -0.226 0.796 -0.569 -0.337 0.749 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 0.999 -0.443
29 -6.479 0.070 0.244 0.323 -5.201 -0.041 0.074 0.164 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 3.181 0.938 0.747 0.677 -4.506 0.328 0.559 0.387 -5.245 0.097 0.335 0.396 -6.421 -0.001 0.289 0.338 -1.709 -0.660 0.327 1.125 -3.284 0.433 0.667 0.575 -1.879 -1.775 -1.682 -1.596 -6.340 0.068 0.338 0.368 -4.779 -0.219 0.513 0.663 -6.360 -0.039 0.231 0.335 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 -5.874 -0.284 -0.122 0.305 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -3.264 -1.262 -1.113 -0.993 -1.670 -1.164 -1.036 -2.298 -1.291 -2.413 -2.233 -2.080 -2.368 -1.654 -1.394 -1.205 -2.224 -1.381 -1.243 -1.239 -3.850 -1.397 -1.407 -1.416 -2.197 -1.906 -1.920 0.308 -2.380 0.262 0.245 0.227 -2.447 -2.234 -2.244 -2.253 -2.283 -1.904 -1.984 -2.138 -0.502 0.569 0.537 0.505 0.215 0.468 0.436 0.977 0.271 1.011 0.971 0.931 0.163 0.359 0.314 0.268 -0.009 0.341 0.179 -0.132 -2.890 -0.839 -0.789 -0.743 -1.443 -1.009 -0.946 -1.482 -1.274 -1.630 -1.551 -1.478 -2.111 -1.603 -1.460 -1.340 -1.889 -1.227 -1.192 -1.261 1.425 1.373 1.326 1.283 1.520 1.479 1.444 1.379 1.735 1.654 1.572 1.496 1.889 1.795 1.705 1.620 1.143 1.111 1.073 1.031 -3.791 -1.704 -1.682 -1.661 -2.529 -2.133 -2.098 -1.006 -2.602 -1.177 -1.136 -1.095 -2.499 -2.209 -2.159 -2.111 -2.425 -1.932 -1.936 -2.021 -3.772 -1.760 -1.769 -1.777 -2.676 -2.294 -2.308 -0.063 -2.897 -0.138 -0.139 -0.137 -2.471 -2.219 -2.219 -2.217 -2.418 -1.972 -2.045 -2.210 -1.660 -0.191 -0.193 -0.196 -0.642 -0.265 -0.268 -0.654 -0.502 -0.676 -0.671 -0.662 -0.803 -0.499 -0.493 -0.484 -0.947 -0.466 -0.583 -0.841 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 2.912 0.340 0.349 0.375 0.391 0.376 0.405 -2.929 0.423 -2.929 -2.643 -2.418 0.530 0.159 0.218 0.267 0.223 0.266 0.309 -2.931 0.311 -2.932 -2.645 
-2.420 0.705 0.908 0.833 0.894 1.378 1.492 1.465 1.257 0.831 0.970 0.958 0.999 -1.163 -1.163 -1.163 -1.204 1.128 1.957 1.738 1.839 -1.591 0.887 -0.523 -0.270 -0.226 -0.226 -0.226 -0.226 -1.625 0.796 -0.582 -0.335 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 0.999

Splitting the 3 Dataframes into Training and Testing Sets

Before fitting a regression model, we split each dataframe into a training set (70% of the rows) and a test set (30% of the rows).

In [384]:
from sklearn.model_selection import train_test_split

# Fix random_state so that the train and test sets contain the same rows on every run

cameraaccessory_dlmul_train, cameraaccessory_dlmul_test = train_test_split(
    cameraaccessory_dlmul_df, train_size=0.7, test_size=0.3, random_state=100)

gamingaccessory_dlmul_train, gamingaccessory_dlmul_test = train_test_split(
    gamingaccessory_dlmul_df, train_size=0.7, test_size=0.3, random_state=100)

homeaudio_dlmul_train, homeaudio_dlmul_test = train_test_split(
    homeaudio_dlmul_df, train_size=0.7, test_size=0.3, random_state=100)
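
The notebook scales each full dataframe first and splits afterwards. A common alternative, sketched here and not the approach used in this notebook, is to split first, fit the scaler on the training rows only, and reuse that fitted scaler on the test rows. The frame `toy_df` and its random values are hypothetical stand-ins for one of the `*_dlmul_df` dataframes.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame standing in for one of the *_dlmul_df dataframes
rng = np.random.default_rng(0)
toy_df = pd.DataFrame({"gmv": rng.normal(size=20), "TV": rng.normal(size=20)})

# Same split settings as the notebook; a fixed random_state makes the
# shuffled split repeatable, so a second call selects the same rows
toy_train, toy_test = train_test_split(
    toy_df, train_size=0.7, test_size=0.3, random_state=100)
toy_train_again, _ = train_test_split(
    toy_df, train_size=0.7, test_size=0.3, random_state=100)
print(toy_train.index.equals(toy_train_again.index))

# Fit the scaler on the training rows only, then apply the same
# transformation (training mean and std) to the test rows
scaler = StandardScaler()
train_scaled = pd.DataFrame(
    scaler.fit_transform(toy_train), columns=toy_train.columns, index=toy_train.index)
test_scaled = pd.DataFrame(
    scaler.transform(toy_test), columns=toy_test.columns, index=toy_test.index)
print(train_scaled.shape, test_scaled.shape)
```

With this ordering the test rows do not influence the means and standard deviations used for scaling. Whether that matters for this model is a judgment call.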

Dividing the 3 Dataframes into X and y Sets for Model Building
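
The cell for this step relies on `DataFrame.pop`, which removes a column in place and returns it. The sketch below shows that behaviour on a small made-up frame. The names `toy_train`, `X_toy_train` and `y_toy_train` and all values are hypothetical.

```python
import pandas as pd

# Hypothetical toy training frame (illustrative values only)
toy_train = pd.DataFrame({
    "gmv": [1.0, 2.0, 3.0],
    "TV": [0.1, 0.2, 0.3],
    "Sale": [0, 1, 0],
})

# pop() removes 'gmv' from toy_train in place and returns it as a Series
y_toy_train = toy_train.pop("gmv")

# X is the same object as toy_train, which no longer contains 'gmv'
X_toy_train = toy_train

print(list(X_toy_train.columns))
print(y_toy_train.name)
```

Because `pop` mutates the training frame, running such a cell a second time raises a `KeyError`, since the `gmv` column is already gone.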

In [385]:
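# pop() removes the target column 'gmv' from the training frame in place and
# returns it, so the remaining frame holds only the predictors (X)
y_cameraaccessory_dlmul_train = cameraaccessory_dlmul_train.pop('gmv')
X_cameraaccessory_dlmul_train = cameraaccessory_dlmul_train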

y_gamingaccessory_dlmul_train = gamingaccessory_dlmul_train.pop('gmv')
X_gamingaccessory_dlmul_train = gamingaccessory_dlmul_train

y_homeaudio_dlmul_train = homeaudio_dlmul_train.pop('gmv')
X_homeaudio_dlmul_train = homeaudio_dlmul_train

X_homeaudio_dlmul_train.head()
Out[385]:
gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
12 0.101 0.170 0.291 -0.301 0.119 0.167 0.269 0.873 0.596 0.098 -0.507 0.920 0.639 0.157 -0.508 0.231 -0.042 0.148 0.081 0.240 0.175 0.081 0.112 -1.952 -0.115 0.201 0.387 -0.056 0.064 0.212 0.291 0.639 -0.083 1.123 -1.572 0.080 -0.236 -0.138 0.166 1.001 0.723 0.026 0.988 0.103 0.156 0.298 0.305 0.644 -0.219 0.543 0.057 -0.162 0.043 0.136 0.320 0.000 0.000 0.000 0.000 6.856 -0.146 -0.146 -0.146 0.450 0.471 -0.565 1.207 0.736 0.643 0.459 0.477 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.060 0.755 0.749 0.749 0.566 0.741 0.740 0.529 0.633 0.602 0.478 0.347 0.472 0.572 0.491 0.405 0.468 0.650 0.583 0.501 0.575 0.921 0.908 0.896 0.878 0.986 0.972 0.726 0.915 0.835 0.647 0.411 0.879 0.915 0.807 0.656 0.861 0.986 0.929 0.823 -0.699 0.344 0.314 0.285 -0.022 0.229 0.199 0.046 0.024 0.083 -0.030 -0.153 -0.316 -0.142 -0.244 -0.368 -0.163 0.211 0.164 0.084 0.205 0.721 0.741 0.762 0.563 0.736 0.755 0.487 0.652 0.589 0.411 0.192 0.318 0.420 0.307 0.147 0.440 0.624 0.564 0.452 -1.764 -0.249 -0.317 -0.389 -0.633 -0.434 -0.525 -0.494 -0.772 -0.585 -0.578 -0.565 -0.800 -0.603 -0.599 -0.578 -1.068 -0.616 -0.650 -0.644 0.316 0.414 0.423 0.432 0.375 0.420 0.429 0.402 0.386 0.394 0.383 0.376 0.452 0.491 0.505 0.521 0.391 0.443 0.456 0.466 0.385 0.459 0.452 0.446 0.421 0.446 0.438 0.371 0.404 0.377 0.331 0.287 0.523 0.536 0.526 0.515 0.461 0.497 0.497 0.490 -0.469 -0.138 -0.140 -0.143 -0.316 -0.209 -0.212 -0.356 -0.288 -0.310 -0.399 -0.490 -0.591 -0.513 -0.531 -0.553 -0.392 -0.231 -0.223 -0.238 0.526 -2.063 -2.063 -2.063 0.320 0.311 0.267 0.226 0.006 -0.007 -0.070 -0.153 -0.944 -0.769 -0.895 -1.036 -0.735 -0.199 -0.377 -0.639 -0.406 1.990 1.990 1.990 1.351 2.041 2.041 1.382 1.565 1.633 1.176 0.490 1.358 1.979 1.798 1.460 1.292 1.926 1.835 1.627 0.812 0.147 0.211 0.261 0.236 0.261 0.305 0.352 0.315 0.348 0.389 0.427 0.944 0.013 0.114 0.180 0.164 0.180 0.233 0.310 0.255 0.297 0.354 0.406 -0.603 -0.018 0.347 0.138 -1.033 -0.992 -0.952 -0.913 -1.473 -0.042 -0.133 0.260 1.077 0.752 
0.830 0.591 -0.451 -0.418 -0.385 -0.355 0.236 0.413 1.866 -1.264 -2.093 -2.134 -0.226 -0.226 0.357 0.498 1.753 -1.306 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 -0.443
10 0.201 0.358 0.302 -0.142 0.177 0.369 0.209 0.082 -0.519 -0.978 -1.538 0.137 -0.522 -0.998 -1.453 -0.030 -0.100 -0.019 0.257 -0.359 -0.083 0.246 0.295 0.084 0.326 0.485 0.408 0.103 0.207 0.422 0.337 1.130 -1.647 -0.660 0.659 -0.227 0.097 0.398 0.508 -0.072 0.920 0.955 0.078 0.250 0.225 0.354 0.365 0.605 -0.041 0.736 0.858 -0.025 0.245 0.491 0.325 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 1.077 0.471 -0.535 0.492 0.442 0.512 0.448 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.771 0.755 0.749 -0.193 0.749 0.493 0.209 -0.158 0.432 0.280 0.113 -0.089 0.481 0.350 0.213 0.031 0.617 0.475 0.294 -0.065 0.934 0.921 0.908 -0.028 1.000 0.755 0.410 -0.200 0.678 0.445 0.135 -0.338 0.836 0.683 0.454 0.060 0.929 0.823 0.604 -0.013 0.375 0.344 0.314 -0.086 0.259 0.107 -0.053 -0.233 0.037 -0.078 -0.201 -0.335 -0.158 -0.276 -0.423 -0.616 0.177 0.102 -0.030 -0.274 0.701 0.721 0.741 -0.538 0.718 0.434 0.039 -0.661 0.353 0.123 -0.182 -0.649 0.233 0.054 -0.217 -0.699 0.525 0.392 0.131 -0.621 -0.183 -0.249 -0.317 -0.215 -0.351 -0.363 -0.384 -0.365 -0.475 -0.488 -0.487 -0.481 -0.485 -0.477 -0.450 -0.403 -0.497 -0.521 -0.496 -0.405 0.405 0.414 0.423 0.315 0.410 0.374 0.337 0.306 0.349 0.333 0.322 0.313 0.468 0.480 0.491 0.503 0.416 0.421 0.420 0.405 0.466 0.459 0.452 0.278 0.453 0.381 0.304 0.228 0.337 0.286 0.237 0.189 0.527 0.512 0.494 0.473 0.475 0.465 0.441 0.394 -0.135 -0.138 -0.140 -0.605 -0.206 -0.357 -0.522 -0.697 -0.405 -0.503 -0.603 -0.706 -0.542 -0.569 -0.604 -0.648 -0.311 -0.325 -0.381 -0.510 -2.063 -2.063 -2.063 0.526 0.352 0.278 0.169 -2.932 -0.048 -0.136 -0.275 -6.723 -0.751 -0.905 -1.088 -1.320 -0.213 -0.492 -0.994 -2.340 1.990 1.990 1.990 -0.406 2.041 1.351 0.171 -0.371 1.135 0.517 -0.651 -0.296 1.674 1.409 0.892 -0.107 1.777 1.613 1.170 -0.552 0.013 0.147 0.211 0.288 0.211 0.271 0.321 0.366 0.314 0.357 0.397 0.435 -1.901 0.013 0.114 0.289 0.113 0.218 0.298 0.365 0.272 0.332 0.385 0.435 0.230 -0.018 0.319 -0.302 -1.033 -0.992 -0.952 -0.913 -0.244 
0.164 0.317 -0.819 0.830 0.592 0.473 1.039 -0.451 -0.418 -0.385 -0.355 1.862 -1.249 1.526 -0.595 -0.226 -0.226 -0.226 2.419 1.749 -1.291 1.421 0.484 -0.309 -0.309 -0.309 2.131 -0.443 -0.443 1.843 -0.443
32 0.149 0.185 -1.661 -0.024 0.154 0.220 -0.651 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 0.655 0.038 0.121 1.249 0.258 0.705 0.437 -0.910 0.140 0.266 0.228 -3.048 0.083 0.227 0.196 -3.271 1.130 1.123 0.644 -1.572 0.624 0.628 0.534 -2.557 -0.323 -0.583 -1.682 -1.596 0.230 0.358 0.277 -3.267 0.388 0.491 0.513 -3.099 -0.097 0.074 0.090 -3.259 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 -0.434 0.389 0.213 -3.192 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.723 0.711 -2.627 -2.409 0.252 -0.397 -2.479 -1.204 -0.205 -0.727 -2.221 -0.949 -0.507 -0.980 -2.382 -1.416 0.185 -0.241 -2.222 -1.271 0.394 0.382 -3.861 -3.867 0.017 -0.529 -4.987 -2.243 -0.522 -1.127 -5.640 -2.459 -0.417 -0.925 -3.343 -2.476 0.062 -0.292 -3.677 -2.282 -0.159 -0.183 -0.541 -0.560 -0.418 -0.561 -0.712 0.122 -0.607 -0.712 -0.827 0.160 -0.583 -0.603 -0.621 0.028 -0.467 -0.542 -0.673 -0.033 1.081 1.093 -2.710 -2.632 0.732 0.068 -3.080 -1.224 0.234 -0.408 -3.090 -1.067 0.157 -0.506 -2.802 -1.618 0.708 0.236 -2.675 -1.371 0.154 0.091 1.326 1.283 -0.278 -0.918 1.444 1.379 -0.985 -1.710 1.572 1.496 -1.255 -1.940 1.705 1.620 -0.527 -1.186 1.073 1.031 0.310 0.319 -3.721 -3.688 -0.100 -0.749 -4.544 -2.403 -0.680 -1.360 -4.905 -2.427 -0.561 -1.073 -3.259 -2.343 -0.059 -0.422 -3.382 -2.140 0.274 0.267 -3.792 -3.802 -0.232 -1.004 -4.808 -2.709 -0.977 -1.826 -5.462 -2.904 -0.667 -1.249 -3.383 -2.470 -0.124 -0.549 -3.524 -2.327 0.151 0.149 -1.669 -1.674 -0.337 -0.927 -1.834 -0.645 -0.819 -1.268 -1.868 -0.488 -0.885 -1.189 -1.657 -0.779 -0.373 -0.701 -1.518 -0.742 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 -0.615 0.069 0.469 0.474 0.269 0.395 0.493 0.462 0.407 0.476 0.538 0.509 -0.754 0.093 0.236 0.281 0.193 0.263 0.321 0.352 0.302 0.350 0.395 0.421 0.653 
0.870 0.851 0.741 1.251 0.977 1.635 1.538 0.789 0.776 1.004 0.946 -1.163 -0.967 -1.163 -1.204 0.690 0.748 2.058 1.371 -0.493 -0.623 1.814 -1.608 -0.226 -0.226 -0.226 -0.226 -0.552 -0.679 1.702 -1.643 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 -0.443
22 0.110 0.170 0.248 -1.150 -0.116 0.116 0.263 0.850 0.914 0.885 0.828 0.894 0.968 0.931 0.882 -0.743 0.114 -0.142 -0.381 0.282 -0.051 0.214 0.103 0.057 0.176 0.185 0.256 -0.083 0.067 0.103 0.161 -0.106 -1.647 -0.660 0.911 -0.541 -0.550 -0.311 -1.122 0.813 1.360 1.260 1.351 0.101 0.271 0.068 -0.158 -1.679 -0.284 0.062 0.141 -0.164 -0.036 0.087 0.264 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 2.326 2.344 0.471 1.664 0.679 0.876 0.155 -0.125 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -0.296 -0.212 0.253 0.285 -0.080 0.114 0.267 0.301 0.095 0.207 0.304 0.322 -0.071 0.104 0.253 0.297 -0.136 0.065 0.255 0.293 -0.128 -0.139 -0.407 -0.418 -0.416 -0.535 -0.672 -0.689 -0.659 -0.753 -0.858 -0.403 -0.138 -0.096 -0.042 0.078 -0.282 -0.307 -0.351 -0.237 -0.103 -0.128 -1.146 -1.158 -0.519 -0.881 -1.357 -1.381 -0.887 -1.161 -1.497 -1.432 -1.103 -1.324 -1.638 -1.587 -0.623 -0.891 -1.478 -1.447 0.232 0.262 0.302 0.330 0.194 0.231 0.267 0.298 0.218 0.253 0.287 0.310 0.099 0.154 0.204 0.249 0.159 0.210 0.258 0.298 -1.628 -1.706 0.050 -0.016 -0.696 -0.307 -0.086 -0.133 -0.515 -0.325 -0.176 -0.411 -0.928 -0.739 -0.542 -0.668 -1.107 -0.676 -0.173 -0.286 -0.274 -0.261 0.444 0.453 -0.042 0.237 0.454 0.471 0.156 0.321 0.468 0.464 0.223 0.366 0.525 0.538 0.044 0.245 0.490 0.507 -0.273 -0.280 0.331 0.324 -0.143 0.093 0.287 0.285 -0.020 0.121 0.251 0.265 0.189 0.306 0.442 0.451 0.009 0.177 0.403 0.420 -0.665 -0.668 -0.039 -0.041 -0.514 -0.297 -0.104 -0.101 -0.355 -0.228 -0.107 -0.184 -0.668 -0.550 -0.407 -0.431 -0.532 -0.327 -0.068 -0.051 0.526 0.526 -2.048 -2.048 0.267 0.278 0.268 0.263 -0.046 -0.005 0.027 -0.022 -1.228 -1.059 -0.873 -1.009 -1.297 -0.800 -0.208 -0.331 -0.406 -0.406 -0.411 -0.411 -2.188 -1.008 -0.318 -0.376 -0.935 -0.599 -0.279 -0.676 -1.438 -1.017 -0.504 -0.525 -0.804 -0.252 0.350 0.311 0.444 0.201 0.165 0.223 0.222 0.247 0.272 0.314 0.296 0.325 0.353 0.404 0.389 0.173 0.249 0.292 0.235 0.286 0.331 0.368 0.324 0.364 0.402 0.437 0.916 0.962 0.778 0.829 0.872 
0.977 1.433 1.125 0.818 0.834 0.901 0.998 -1.843 -1.615 -1.163 -1.204 1.153 1.076 1.231 1.830 -0.493 -0.584 0.790 -1.563 -0.226 -0.226 -0.226 -0.226 -0.552 -0.641 0.701 -1.599 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 1.843
45 0.357 0.228 0.240 0.638 0.633 0.338 0.181 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 0.289 -0.342 0.213 0.492 0.937 0.878 0.458 0.060 0.328 0.545 0.323 0.309 0.046 0.388 0.236 0.229 0.639 -0.083 0.327 -0.049 0.389 0.411 0.635 0.290 0.588 -0.271 0.220 -0.454 0.087 0.242 0.277 0.234 0.341 0.114 -0.087 0.057 0.256 0.541 0.231 0.232 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 -0.071 0.163 0.125 -0.105 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.853 0.829 -0.191 -0.130 0.555 0.241 -0.156 -0.098 0.313 0.131 -0.085 0.398 0.603 0.458 0.315 0.467 0.601 0.385 0.043 0.203 0.598 0.586 0.285 0.273 0.479 0.347 0.193 0.177 0.317 0.214 0.101 0.211 0.524 0.450 0.354 0.370 0.551 0.477 0.335 0.361 0.836 0.799 -0.545 -0.565 0.423 -0.041 -0.717 -0.745 0.012 -0.354 -0.832 0.693 0.757 0.718 0.683 0.936 0.596 0.410 0.041 0.410 0.987 1.001 -0.413 -0.372 0.718 0.279 -0.527 -0.478 0.382 0.034 -0.514 0.343 0.753 0.562 0.300 0.533 0.754 0.483 -0.152 0.091 0.531 0.471 -1.013 -1.097 0.209 -0.287 -1.360 -1.391 -0.273 -0.775 -1.632 -0.001 0.194 0.055 -0.102 0.069 0.342 0.001 -0.774 -0.390 0.567 0.575 0.292 0.302 0.501 0.393 0.272 0.290 0.406 0.339 0.272 0.404 0.521 0.481 0.425 0.465 0.520 0.469 0.369 0.413 0.552 0.545 0.303 0.296 0.470 0.366 0.253 0.251 0.358 0.285 0.212 0.293 0.523 0.472 0.405 0.414 0.519 0.475 0.388 0.415 1.099 1.098 -0.491 -0.494 0.703 0.178 -0.584 -0.580 0.295 -0.096 -0.596 0.881 0.942 0.873 0.781 1.085 0.805 0.565 0.109 0.428 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 -0.963 0.026 0.157 0.216 0.135 0.208 0.266 0.309 0.256 0.304 0.348 0.377 -1.585 0.035 0.251 0.294 0.172 0.262 0.333 0.370 0.298 0.354 0.404 0.437 -1.666 -0.018 -0.790 -0.162 -1.033 -0.992 -0.952 0.221 -1.654 0.130 -1.037 0.158 1.139 0.624 1.050 0.683 
-0.451 -0.418 -0.385 -0.355 0.618 -2.404 0.742 1.771 -0.226 -0.226 -0.226 -0.226 0.533 -2.420 0.655 1.660 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 -0.443

Splitting the test sets of the 3 dataframes into X (features) and y (target) for model building

In [386]:
y_cameraaccessory_dlmul_test = cameraaccessory_dlmul_test.pop('gmv')
X_cameraaccessory_dlmul_test = cameraaccessory_dlmul_test

y_gamingaccessory_dlmul_test = gamingaccessory_dlmul_test.pop('gmv')
X_gamingaccessory_dlmul_test = gamingaccessory_dlmul_test

y_homeaudio_dlmul_test = homeaudio_dlmul_test.pop('gmv')
X_homeaudio_dlmul_test = homeaudio_dlmul_test

X_homeaudio_dlmul_test.head()
Out[386]:
gmv_lag1 gmv_lag2 gmv_lag3 Discount% Discount%_lag1 Discount%_lag2 Discount%_lag3 deliverybdays deliverybdays_lag1 deliverybdays_lag2 deliverybdays_lag3 deliverycdays deliverycdays_lag1 deliverycdays_lag2 deliverycdays_lag3 sla sla_lag1 sla_lag2 sla_lag3 product_procurement_sla product_procurement_sla_lag1 product_procurement_sla_lag2 product_procurement_sla_lag3 is_cod is_cod_lag1 is_cod_lag2 is_cod_lag3 is_mass_market is_mass_market_lag1 is_mass_market_lag2 is_mass_market_lag3 product_vertical_djcontroller product_vertical_djcontroller_lag1 product_vertical_djcontroller_lag2 product_vertical_djcontroller_lag3 product_vertical_dock product_vertical_dock_lag1 product_vertical_dock_lag2 product_vertical_dock_lag3 product_vertical_dockingstation product_vertical_dockingstation_lag1 product_vertical_dockingstation_lag2 product_vertical_dockingstation_lag3 product_vertical_fmradio product_vertical_fmradio_lag1 product_vertical_fmradio_lag2 product_vertical_fmradio_lag3 product_vertical_hifisystem product_vertical_hifisystem_lag1 product_vertical_hifisystem_lag2 product_vertical_hifisystem_lag3 product_vertical_homeaudiospeaker product_vertical_homeaudiospeaker_lag1 product_vertical_homeaudiospeaker_lag2 product_vertical_homeaudiospeaker_lag3 product_vertical_karaokeplayer product_vertical_karaokeplayer_lag1 product_vertical_karaokeplayer_lag2 product_vertical_karaokeplayer_lag3 product_vertical_slingbox product_vertical_slingbox_lag1 product_vertical_slingbox_lag2 product_vertical_slingbox_lag3 product_vertical_soundmixer product_vertical_soundmixer_lag1 product_vertical_soundmixer_lag2 product_vertical_soundmixer_lag3 product_vertical_voicerecorder product_vertical_voicerecorder_lag1 product_vertical_voicerecorder_lag2 product_vertical_voicerecorder_lag3 payday_week payday_week_lag1 payday_week_lag2 payday_week_lag3 holiday_week holiday_week_lag1 holiday_week_lag2 holiday_week_lag3 Total Investment Total Investment_lag1 Total Investment_lag2 Total Investment_lag3 
Total Investment_SMA_3 Total Investment_SMA_3_lag1 Total Investment_SMA_3_lag2 Total Investment_SMA_3_lag3 Total Investment_SMA_5 Total Investment_SMA_5_lag1 Total Investment_SMA_5_lag2 Total Investment_SMA_5_lag3 Total Investment_EMA_8 Total Investment_EMA_8_lag1 Total Investment_EMA_8_lag2 Total Investment_EMA_8_lag3 Total_Investment_Ad_Stock Total_Investment_Ad_Stock_lag1 Total_Investment_Ad_Stock_lag2 Total_Investment_Ad_Stock_lag3 TV TV_lag1 TV_lag2 TV_lag3 TV_SMA_3 TV_SMA_3_lag1 TV_SMA_3_lag2 TV_SMA_3_lag3 TV_SMA_5 TV_SMA_5_lag1 TV_SMA_5_lag2 TV_SMA_5_lag3 TV_EMA_8 TV_EMA_8_lag1 TV_EMA_8_lag2 TV_EMA_8_lag3 TV_Ad_Stock TV_Ad_Stock_lag1 TV_Ad_Stock_lag2 TV_Ad_Stock_lag3 Digital Digital_lag1 Digital_lag2 Digital_lag3 Digital_SMA_3 Digital_SMA_3_lag1 Digital_SMA_3_lag2 Digital_SMA_3_lag3 Digital_SMA_5 Digital_SMA_5_lag1 Digital_SMA_5_lag2 Digital_SMA_5_lag3 Digital_EMA_8 Digital_EMA_8_lag1 Digital_EMA_8_lag2 Digital_EMA_8_lag3 Digital_Ad_Stock Digital_Ad_Stock_lag1 Digital_Ad_Stock_lag2 Digital_Ad_Stock_lag3 Sponsorship Sponsorship_lag1 Sponsorship_lag2 Sponsorship_lag3 Sponsorship_SMA_3 Sponsorship_SMA_3_lag1 Sponsorship_SMA_3_lag2 Sponsorship_SMA_3_lag3 Sponsorship_SMA_5 Sponsorship_SMA_5_lag1 Sponsorship_SMA_5_lag2 Sponsorship_SMA_5_lag3 Sponsorship_EMA_8 Sponsorship_EMA_8_lag1 Sponsorship_EMA_8_lag2 Sponsorship_EMA_8_lag3 Sponsorship_Ad_Stock Sponsorship_Ad_Stock_lag1 Sponsorship_Ad_Stock_lag2 Sponsorship_Ad_Stock_lag3 Content Marketing Content Marketing_lag1 Content Marketing_lag2 Content Marketing_lag3 Content Marketing_SMA_3 Content Marketing_SMA_3_lag1 Content Marketing_SMA_3_lag2 Content Marketing_SMA_3_lag3 Content Marketing_SMA_5 Content Marketing_SMA_5_lag1 Content Marketing_SMA_5_lag2 Content Marketing_SMA_5_lag3 Content Marketing_EMA_8 Content Marketing_EMA_8_lag1 Content Marketing_EMA_8_lag2 Content Marketing_EMA_8_lag3 Content_Marketing_Ad_Stock Content_Marketing_Ad_Stock_lag1 Content_Marketing_Ad_Stock_lag2 Content_Marketing_Ad_Stock_lag3 Online 
marketing Online marketing_lag1 Online marketing_lag2 Online marketing_lag3 Online marketing_SMA_3 Online marketing_SMA_3_lag1 Online marketing_SMA_3_lag2 Online marketing_SMA_3_lag3 Online marketing_SMA_5 Online marketing_SMA_5_lag1 Online marketing_SMA_5_lag2 Online marketing_SMA_5_lag3 Online marketing_EMA_8 Online marketing_EMA_8_lag1 Online marketing_EMA_8_lag2 Online marketing_EMA_8_lag3 Online_marketing_Ad_Stock Online_marketing_Ad_Stock_lag1 Online_marketing_Ad_Stock_lag2 Online_marketing_Ad_Stock_lag3 Affiliates Affiliates_lag1 Affiliates_lag2 Affiliates_lag3 Affiliates_SMA_3 Affiliates_SMA_3_lag1 Affiliates_SMA_3_lag2 Affiliates_SMA_3_lag3 Affiliates_SMA_5 Affiliates_SMA_5_lag1 Affiliates_SMA_5_lag2 Affiliates_SMA_5_lag3 Affiliates_EMA_8 Affiliates_EMA_8_lag1 Affiliates_EMA_8_lag2 Affiliates_EMA_8_lag3 Affiliates_Ad_Stock Affiliates_Ad_Stock_lag1 Affiliates_Ad_Stock_lag2 Affiliates_Ad_Stock_lag3 SEM SEM_lag1 SEM_lag2 SEM_lag3 SEM_SMA_3 SEM_SMA_3_lag1 SEM_SMA_3_lag2 SEM_SMA_3_lag3 SEM_SMA_5 SEM_SMA_5_lag1 SEM_SMA_5_lag2 SEM_SMA_5_lag3 SEM_EMA_8 SEM_EMA_8_lag1 SEM_EMA_8_lag2 SEM_EMA_8_lag3 SEM_Ad_Stock SEM_Ad_Stock_lag1 SEM_Ad_Stock_lag2 SEM_Ad_Stock_lag3 Radio Radio_lag1 Radio_lag2 Radio_lag3 Radio_SMA_3 Radio_SMA_3_lag1 Radio_SMA_3_lag2 Radio_SMA_3_lag3 Radio_SMA_5 Radio_SMA_5_lag1 Radio_SMA_5_lag2 Radio_SMA_5_lag3 Radio_EMA_8 Radio_EMA_8_lag1 Radio_EMA_8_lag2 Radio_EMA_8_lag3 Radio_Ad_Stock Radio_Ad_Stock_lag1 Radio_Ad_Stock_lag2 Radio_Ad_Stock_lag3 Other Other_lag1 Other_lag2 Other_lag3 Other_SMA_3 Other_SMA_3_lag1 Other_SMA_3_lag2 Other_SMA_3_lag3 Other_SMA_5 Other_SMA_5_lag1 Other_SMA_5_lag2 Other_SMA_5_lag3 Other_EMA_8 Other_EMA_8_lag1 Other_EMA_8_lag2 Other_EMA_8_lag3 Other_Ad_Stock Other_Ad_Stock_lag1 Other_Ad_Stock_lag2 Other_Ad_Stock_lag3 NPS NPS_lag1 NPS_lag2 NPS_lag3 NPS_SMA_3 NPS_SMA_3_lag1 NPS_SMA_3_lag2 NPS_SMA_3_lag3 NPS_SMA_5 NPS_SMA_5_lag1 NPS_SMA_5_lag2 NPS_SMA_5_lag3 Stock Index Stock Index_lag1 Stock Index_lag2 Stock Index_lag3 Stock 
Index_SMA_3 Stock Index_SMA_3_lag1 Stock Index_SMA_3_lag2 Stock Index_SMA_3_lag3 Stock Index_SMA_5 Stock Index_SMA_5_lag1 Stock Index_SMA_5_lag2 Stock Index_SMA_5_lag3 Max Temp Max Temp_lag1 Max Temp_lag2 Max Temp_lag3 Min Temp Min Temp_lag1 Min Temp_lag2 Min Temp_lag3 Mean Temp Mean Temp_lag1 Mean Temp_lag2 Mean Temp_lag3 Heat Deg Days Heat Deg Days_lag1 Heat Deg Days_lag2 Heat Deg Days_lag3 Cool Deg Days Cool Deg Days_lag1 Cool Deg Days_lag2 Cool Deg Days_lag3 Total Rain (mm) Total Rain (mm)_lag1 Total Rain (mm)_lag2 Total Rain (mm)_lag3 Total Snow (cm) Total Snow (cm)_lag1 Total Snow (cm)_lag2 Total Snow (cm)_lag3 Total Precip (mm) Total Precip (mm)_lag1 Total Precip (mm)_lag2 Total Precip (mm)_lag3 Snow on Grnd (cm) Snow on Grnd (cm)_lag1 Snow on Grnd (cm)_lag2 Snow on Grnd (cm)_lag3 Sale Sale_lag1 Sale_lag2 Sale_lag3
31 0.118 -2.071 0.208 0.070 0.167 -0.881 0.137 0.390 0.400 0.409 0.422 0.354 0.365 0.376 0.391 -0.128 0.052 1.393 0.746 1.406 0.459 -1.187 0.370 0.233 0.178 -3.478 0.210 0.192 0.141 -3.803 0.139 1.130 0.644 -1.647 -0.611 0.643 0.536 -2.801 0.454 -0.646 -1.775 -1.682 -1.596 0.372 0.242 -3.798 0.189 0.522 0.527 -3.473 -0.084 -0.016 0.013 -3.784 0.112 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 0.406 0.163 -3.683 -0.067 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.723 -2.901 -2.627 -0.993 -0.490 -2.713 -1.330 -0.926 -0.830 -2.400 -1.046 -2.080 -1.254 -2.772 -1.626 -1.205 -0.385 -2.615 -1.476 -0.969 0.394 -3.855 -3.861 -1.416 -0.515 -4.973 -2.225 -1.938 -1.107 -5.603 -2.428 0.227 -0.912 -3.334 -2.467 -2.253 -0.293 -3.677 -2.282 -1.904 -0.159 -0.522 -0.541 0.505 -0.539 -0.692 0.155 0.401 -0.686 -0.795 0.200 0.931 -0.563 -0.577 0.075 0.268 -0.535 -0.664 -0.023 0.325 1.081 -2.796 -2.710 -0.743 0.031 -3.200 -1.291 -0.887 -0.460 -3.210 -1.130 -1.478 -0.601 -3.026 -1.753 -1.340 0.191 -2.925 -1.509 -0.975 0.154 1.373 1.326 1.283 -0.824 1.479 1.444 1.379 -1.660 1.654 1.572 1.496 -1.900 1.795 1.705 1.620 -1.085 1.111 1.073 1.031 0.310 -3.756 -3.721 -1.661 -0.770 -4.604 -2.450 -2.054 -1.400 -4.998 -2.485 -1.095 -1.109 -3.321 -2.393 -2.111 -0.467 -3.508 -2.228 -1.769 0.274 -3.782 -3.792 -1.777 -0.993 -4.789 -2.706 -2.310 -1.818 -5.460 -2.907 -0.137 -1.247 -3.384 -2.472 -2.217 -0.566 -3.562 -2.359 -1.916 0.151 -1.665 -1.669 -0.196 -0.923 -1.828 -0.649 -0.265 -1.270 -1.876 -0.496 -0.662 -1.195 -1.668 -0.790 -0.484 -0.767 -1.616 -0.805 -0.368 0.526 0.526 0.526 0.526 0.471 0.438 0.404 0.399 0.332 0.320 0.310 0.302 0.988 0.953 0.917 0.880 0.747 0.726 0.698 0.665 -0.406 -0.406 -0.406 -0.406 -0.313 -0.313 -0.313 -0.371 -0.167 -0.235 -0.274 -0.296 -0.421 -0.497 -0.559 -0.599 -0.629 -0.683 -0.713 -0.719 -0.615 0.508 0.469 0.375 0.374 0.475 0.435 0.435 0.451 0.514 0.480 -2.418 -0.754 0.183 0.236 0.267 0.214 0.280 0.313 0.348 0.312 0.359 0.386 -2.420 
0.894 0.850 0.720 0.910 0.923 1.573 1.496 1.596 0.736 0.967 0.909 1.043 -0.967 -1.163 -1.163 -1.204 0.676 1.937 1.311 2.154 -0.623 1.810 -1.612 0.885 -0.226 -0.226 -0.226 -0.226 -0.679 1.698 -1.646 0.794 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 -0.443
5 0.207 0.369 0.295 1.152 0.289 0.449 0.317 -1.640 -1.561 -2.012 -1.697 -1.664 -1.603 -2.129 -1.448 -0.743 -0.095 -0.226 0.217 -0.264 0.498 0.471 0.294 0.549 0.355 0.563 0.390 0.410 0.165 0.385 0.235 -1.709 -0.660 0.327 -0.049 0.624 0.454 0.070 0.052 -0.323 0.480 -0.211 0.988 0.212 0.231 0.304 0.275 -0.240 0.064 -0.087 0.357 0.592 0.266 0.548 0.320 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 1.077 -0.565 -0.535 0.475 0.196 0.438 0.385 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -0.433 -0.336 -0.258 0.455 -0.373 0.001 0.264 0.464 0.059 0.230 0.370 0.564 0.019 0.223 0.400 0.567 -0.219 0.022 0.247 0.485 0.006 -0.005 -0.017 0.433 -0.156 0.074 0.245 0.378 0.033 0.138 0.227 0.343 0.238 0.313 0.402 0.507 0.091 0.188 0.325 0.509 -0.010 -0.036 -0.061 -1.512 -0.150 -0.518 -1.003 -1.760 -0.672 -0.986 -1.384 -0.928 -0.506 -0.563 -0.628 -0.707 -0.357 -0.486 -0.726 -1.231 -0.675 -0.626 -0.581 -1.239 -0.834 -0.977 -1.167 -1.435 -1.084 -1.171 -1.283 -0.212 -0.794 -0.556 -0.329 -0.108 -0.947 -0.795 -0.663 -0.538 -0.016 -0.081 -0.147 0.233 -0.157 -0.026 0.066 0.157 -0.041 0.014 0.065 0.145 -0.088 -0.062 -0.016 0.050 -0.082 -0.025 0.105 0.322 0.286 0.296 0.305 0.605 0.266 0.409 0.535 0.652 0.424 0.514 0.600 0.681 0.508 0.569 0.641 0.725 0.385 0.453 0.545 0.670 0.298 0.291 0.284 0.601 0.246 0.387 0.512 0.628 0.391 0.475 0.557 0.619 0.537 0.581 0.637 0.708 0.408 0.470 0.560 0.691 -0.595 -0.599 -0.602 -0.485 -0.694 -0.654 -0.616 -0.571 -0.669 -0.642 -0.611 -0.079 -0.463 -0.321 -0.151 0.049 -0.616 -0.493 -0.345 -0.156 0.526 0.526 0.526 -0.142 -2.424 0.314 0.332 0.363 0.087 0.151 0.195 0.187 -0.599 -0.416 -0.218 -0.007 -0.641 -0.098 0.527 1.198 -0.406 -0.406 -0.406 2.908 -0.313 1.073 2.253 3.060 1.321 2.024 2.535 2.550 1.071 1.672 2.319 2.977 0.706 1.317 1.948 2.576 0.390 0.194 0.244 0.218 0.245 0.266 0.289 0.311 0.305 0.333 0.360 0.383 0.715 0.196 0.245 0.201 0.244 0.260 0.278 0.296 0.298 0.324 0.350 0.373 0.002 -1.475 -0.168 -0.977 -1.033 -0.992 -0.952 -0.913 -0.327 
-1.572 -0.587 -1.428 0.865 1.404 0.979 1.222 -0.451 -0.418 -0.385 -0.355 -0.584 -0.584 0.585 -0.595 -0.226 2.836 -0.226 -0.226 -0.641 0.278 0.501 -0.652 0.407 1.415 -0.309 -0.309 1.843 -0.443 0.999 -0.443
9 0.347 0.262 0.325 0.149 0.371 0.155 0.284 -0.525 -0.981 -1.538 -1.459 -0.529 -1.002 -1.454 -1.494 -0.399 -0.133 0.214 0.240 -0.515 0.202 0.251 0.166 0.308 0.476 0.379 0.444 0.165 0.414 0.302 0.335 -1.709 -0.660 0.644 0.350 0.080 0.389 0.497 0.242 0.911 0.920 0.026 0.799 0.194 0.334 0.331 0.284 -0.104 0.782 0.900 0.639 0.215 0.493 0.289 0.392 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 1.057 0.471 -0.565 -0.535 0.475 0.517 0.415 0.496 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.771 0.755 -0.258 -0.193 0.479 0.171 -0.220 -0.158 0.246 0.068 -0.144 0.061 0.327 0.165 -0.034 0.107 0.481 0.257 -0.142 -0.023 0.934 0.921 -0.017 -0.028 0.770 0.424 -0.184 -0.200 0.461 0.153 -0.317 -0.165 0.697 0.468 0.072 0.123 0.822 0.604 -0.013 0.028 0.375 0.344 -0.061 -0.086 0.137 -0.026 -0.201 -0.233 -0.046 -0.166 -0.295 -0.535 -0.235 -0.380 -0.567 -0.626 0.107 -0.022 -0.262 -0.314 0.701 0.721 -0.581 -0.538 0.407 0.001 -0.714 -0.661 0.086 -0.226 -0.702 -0.763 -0.000 -0.285 -0.787 -0.617 0.361 0.079 -0.714 -0.616 -0.183 -0.249 -0.147 -0.215 -0.281 -0.297 -0.321 -0.365 -0.427 -0.442 -0.445 -0.328 -0.415 -0.394 -0.353 -0.341 -0.440 -0.428 -0.353 -0.351 0.405 0.414 0.305 0.315 0.364 0.327 0.288 0.306 0.318 0.302 0.289 0.399 0.462 0.472 0.481 0.531 0.401 0.398 0.380 0.422 0.466 0.459 0.284 0.278 0.388 0.312 0.230 0.228 0.291 0.238 0.187 0.292 0.513 0.493 0.470 0.502 0.454 0.429 0.380 0.414 -0.135 -0.138 -0.602 -0.605 -0.354 -0.519 -0.701 -0.697 -0.504 -0.608 -0.715 -0.679 -0.574 -0.611 -0.658 -0.556 -0.374 -0.430 -0.564 -0.488 -2.063 -2.063 0.526 0.526 0.320 0.220 -2.931 -2.932 -0.124 -0.266 -6.721 -0.072 -0.828 -1.016 -1.255 -1.057 -0.392 -0.891 -2.239 -1.633 1.990 1.990 -0.406 -0.406 1.351 0.171 -0.313 -0.371 0.529 -0.600 -0.274 0.216 1.358 0.884 -0.080 0.510 1.580 1.165 -0.546 0.074 0.013 0.147 0.244 0.288 0.223 0.280 0.329 0.366 0.319 0.362 0.401 0.426 -1.901 0.013 0.245 0.289 0.159 0.254 0.328 0.365 0.291 0.349 0.400 0.423 -0.127 0.261 -0.409 -2.757 -1.033 -0.992 -0.952 
-0.913 0.113 0.269 -0.880 -1.428 0.592 0.473 1.031 1.174 -0.451 -0.418 -0.385 -0.355 -1.249 1.524 -0.597 1.859 -0.226 -0.226 2.419 3.204 -1.291 1.419 0.482 1.946 -0.309 -0.309 2.131 2.424 -0.443 1.843 -0.443 -0.443
3 0.207 0.262 0.395 1.221 0.257 0.331 0.507 -2.009 -1.696 -1.878 0.422 -2.124 -1.449 -1.991 0.391 -0.995 0.112 0.071 -0.154 0.915 0.207 0.548 0.563 0.608 0.329 0.367 0.594 0.384 0.129 0.293 0.565 0.314 -0.083 -0.083 0.659 0.012 -0.036 0.534 0.334 -0.323 0.920 0.525 -0.454 0.259 0.185 0.098 0.220 -0.240 0.337 -0.322 0.100 0.640 0.244 0.329 0.586 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 -0.588 -0.565 -0.565 -0.535 0.458 0.319 0.199 0.406 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -0.433 0.414 0.434 0.455 0.193 0.419 0.577 0.707 0.310 0.521 0.679 0.755 0.358 0.552 0.619 0.666 0.153 0.454 0.537 0.610 0.006 0.457 0.445 0.433 0.273 0.407 0.451 0.488 0.261 0.378 0.433 0.450 0.430 0.533 0.539 0.541 0.324 0.509 0.539 0.569 -0.010 -1.496 -1.504 -1.512 -0.969 -1.729 -0.385 0.220 -1.334 -0.857 0.070 0.333 -0.545 -0.614 -0.039 0.253 -0.709 -1.205 -0.406 0.122 -0.675 -1.353 -1.294 -1.239 -1.312 -1.585 0.114 0.718 -1.435 -0.301 0.628 0.902 -0.492 -0.230 0.388 0.697 -0.904 -0.731 0.082 0.529 -0.016 0.354 0.294 0.233 0.210 0.277 0.258 0.253 0.186 0.251 0.259 0.240 0.116 0.172 0.123 0.076 0.222 0.421 0.393 0.371 0.286 0.589 0.597 0.605 0.517 0.630 0.632 0.641 0.572 0.646 0.655 0.670 0.607 0.688 0.677 0.676 0.509 0.637 0.642 0.650 0.298 0.614 0.608 0.601 0.527 0.637 0.602 0.572 0.562 0.617 0.578 0.561 0.638 0.705 0.662 0.633 0.539 0.668 0.656 0.645 -0.595 -0.479 -0.482 -0.485 -0.609 -0.571 0.180 0.703 -0.617 -0.090 0.619 0.891 -0.160 0.038 0.475 0.742 -0.444 -0.237 0.212 0.560 0.526 -0.142 -0.142 -0.142 0.408 0.405 0.332 0.265 0.217 0.205 0.066 -0.072 -0.101 0.093 -0.447 -1.102 0.594 1.224 0.802 0.121 -0.406 2.908 2.908 2.908 2.253 2.943 2.253 1.090 2.357 2.487 1.383 0.216 2.137 2.814 1.790 0.404 1.884 2.541 2.196 1.624 0.390 0.076 0.159 0.218 0.189 0.219 0.259 0.295 0.280 0.306 0.337 0.370 0.715 0.048 0.139 0.201 0.175 0.200 0.249 0.292 0.267 0.294 0.333 0.370 -0.426 -1.356 -1.368 -0.253 -1.033 -0.992 -0.952 -0.913 -0.714 -1.572 -1.497 -3.518 0.979 1.210 1.284 1.142 
-0.451 -0.418 -0.385 -0.355 0.590 -0.584 0.301 1.396 -0.226 -0.226 -0.226 2.419 0.506 -0.641 0.223 1.539 -0.309 -0.309 5.500 -0.309 0.999 -0.443 -0.443 -0.443
18 0.121 0.184 0.359 -0.253 0.127 0.232 0.284 0.839 0.861 0.848 0.901 0.882 0.905 0.894 0.957 -0.785 -0.218 -0.083 0.337 0.127 0.062 -0.049 0.226 0.060 0.205 0.253 0.241 -0.061 0.108 0.188 0.448 1.130 -1.647 0.644 0.350 -3.284 -2.051 -0.311 0.961 1.161 1.419 1.119 1.709 0.015 0.190 0.198 0.516 -0.844 -0.352 0.272 0.988 -0.156 0.045 0.148 0.408 0.000 0.000 0.000 0.000 -0.146 -0.146 -0.146 -0.146 1.488 3.381 1.508 2.307 0.831 0.696 0.470 1.064 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.178 0.218 0.253 0.193 0.196 0.200 0.205 0.212 0.197 0.216 0.236 0.380 0.178 0.235 0.287 0.337 0.161 0.208 0.247 0.281 -0.386 -0.397 -0.407 0.539 -0.644 -0.034 0.291 0.510 -0.042 0.164 0.323 0.577 0.264 0.409 0.566 0.734 -0.078 0.130 0.386 0.681 -1.123 -1.134 -1.146 -0.749 -1.331 -1.177 -1.047 -0.943 -1.243 -1.174 -1.119 -0.751 -1.331 -1.217 -1.086 -0.937 -1.336 -1.227 -1.060 -0.818 0.243 0.273 0.302 0.294 0.202 0.221 0.239 0.257 0.205 0.230 0.254 0.407 0.130 0.186 0.239 0.291 0.173 0.224 0.271 0.319 0.176 0.113 0.050 -2.033 0.067 -0.312 -0.907 -2.480 -0.523 -0.931 -1.507 -1.900 -0.626 -0.851 -1.155 -1.637 -0.206 -0.485 -1.006 -2.701 0.426 0.435 0.444 0.345 0.435 0.403 0.369 0.341 0.381 0.367 0.358 0.375 0.469 0.472 0.474 0.473 0.437 0.441 0.437 0.419 0.344 0.337 0.331 0.365 0.303 0.312 0.322 0.336 0.280 0.286 0.297 0.333 0.457 0.466 0.479 0.499 0.387 0.406 0.430 0.465 -0.034 -0.037 -0.039 -0.477 -0.099 -0.242 -0.398 -0.562 -0.286 -0.379 -0.473 -0.491 -0.495 -0.548 -0.618 -0.711 -0.196 -0.211 -0.264 -0.385 -2.048 -2.048 -2.048 0.526 0.353 0.278 0.169 -2.912 -0.046 -0.134 -0.273 -0.283 -0.904 -1.135 -1.449 -1.946 -0.232 -0.543 -1.136 -3.156 -0.411 -0.411 -0.411 -0.406 -0.318 -1.008 -2.188 -0.371 -0.935 -1.715 -2.955 -0.677 -0.338 -0.381 -0.401 -0.391 0.281 0.152 -0.077 -0.555 -0.498 0.084 0.165 0.320 0.166 0.257 0.329 0.391 0.306 0.361 0.410 0.448 0.784 0.201 0.249 0.298 0.248 0.293 0.334 0.373 0.332 0.370 0.406 0.426 0.265 0.566 0.453 0.494 0.201 -0.171 0.180 0.342 0.381 0.455 
0.442 0.382 0.206 0.151 0.271 0.441 -0.451 -3.688 -0.385 -0.355 -1.739 0.804 0.147 1.009 -0.226 -0.226 -0.226 -0.226 -1.770 0.715 0.073 0.915 -0.309 -0.309 -0.309 -0.309 -0.443 -0.443 -0.443 -0.443
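
One detail of the split above is easy to miss: `DataFrame.pop` removes the column in place, so `X_homeaudio_dlmul_test` is the same object as `homeaudio_dlmul_test`, and running the cell a second time raises a `KeyError` because `gmv` is already gone. Below is a minimal sketch of a non-mutating alternative. The `toy_test` frame and the `split_features_target` helper are hypothetical and not part of the notebook.

```python
import pandas as pd


def split_features_target(df, target="gmv"):
    """Return (X, y) without modifying df, so the cell can be re-run safely."""
    y = df[target].copy()
    X = df.drop(columns=[target])  # drop returns a new frame by default
    return X, y


# Hypothetical stand-in for a scaled test frame such as homeaudio_dlmul_test
toy_test = pd.DataFrame({
    "gmv": [0.5, -1.2, 0.3],
    "gmv_lag1": [0.1, 0.5, -1.2],
    "Discount%": [0.07, 1.15, 0.15],
})

X_toy, y_toy = split_features_target(toy_test)

# The original frame still contains its target column
print(list(toy_test.columns))
print(list(X_toy.columns))
```

Either approach yields the same X and y; the difference only matters when cells are re-executed or when the original frame is needed again later.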

Building Linear Regression model for cameraaccessory

In [387]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

cameraaccessory_dlmul_model = LinearRegression().fit(X_cameraaccessory_dlmul_train, y_cameraaccessory_dlmul_train)
y_cameraaccessory_dlmul_test_pred = cameraaccessory_dlmul_model.predict(X_cameraaccessory_dlmul_test)

print('R2 Score: {}'.format(r2_score(y_cameraaccessory_dlmul_test, y_cameraaccessory_dlmul_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_dlmul_test, y_cameraaccessory_dlmul_test_pred)))
R2 Score: 0.7745573779806632
Mean Squared Error: 0.49997901702155473
With simple linear regression, we get an R2 score of 0.77 and an MSE of 0.50 on the test set.
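
For reference, the two metrics reported above can be written out by hand. The sketch below uses made-up numbers (not the notebook's data) and plain Python. The `mse` and `r2` helpers are illustrative and are meant to mirror the standard definitions behind `mean_squared_error` and `r2_score`.

```python
def mse(y_true, y_pred):
    """Mean of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot measured around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

print("MSE:", mse(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
```

Because both metrics share the same residual sum of squares, MSE equals (1 - R2) times the population variance of the true values. Taking the printed figures at face value, an R2 of 0.77 with an MSE of 0.50 implies that the scaled gmv in the test split has a variance of roughly 2.2.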

Building Linear Regression model for cameraaccessory using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross validation for our linear regression.

In [388]:
y_cameraaccessory_dlmul = cameraaccessory_dlmul_df.pop('gmv')
X_cameraaccessory_dlmul = cameraaccessory_dlmul_df
In [389]:
# Make cross-validated predictions
from sklearn.model_selection import cross_val_score, cross_val_predict
from sklearn import metrics

# Fit on the full dataset; the coefficients of this model are inspected in the next step
cameraaccessory_dlmul_model_cv = LinearRegression().fit(X_cameraaccessory_dlmul, y_cameraaccessory_dlmul)

# cross_val_predict clones the estimator and refits it on each of the 10 folds,
# so every prediction comes from a model that did not see that row
cameraaccessory_dlmul_predictions_cv = cross_val_predict(cameraaccessory_dlmul_model_cv, X_cameraaccessory_dlmul,
                                                         y_cameraaccessory_dlmul, cv=10)

# "accuracy" here is the R2 score of the cross-validated predictions
accuracy = metrics.r2_score(y_cameraaccessory_dlmul, cameraaccessory_dlmul_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_cameraaccessory_dlmul, cameraaccessory_dlmul_predictions_cv)))
Cross-Predicted Accuracy: 0.8161429091180213
Mean Squared Error: 0.18385709088197863
With simple linear regression, using 10-fold cross validation, we get an R2 score of 0.82 and an MSE of 0.18, both computed over all rows of the dataset.
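
To make the fold structure concrete, the following plain-Python sketch shows how rows could be partitioned into 10 folds. It assumes that the integer `cv=10` resolves to an unshuffled K-fold split (scikit-learn's default for regressors), and it uses a hypothetical row count of 52, since the notebook's actual number of weekly rows is not shown here. The `kfold_indices` helper is illustrative only.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) for contiguous, unshuffled folds.

    The first n_samples % n_splits folds receive one extra row.
    """
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical row count; the notebook's weekly frame may differ
n_rows = 52
folds = list(kfold_indices(n_rows, 10))

for i, (train_idx, test_idx) in enumerate(folds):
    print("fold", i, "train rows:", len(train_idx),
          "test rows:", test_idx[0], "to", test_idx[-1])
```

Each row lands in exactly one test fold, which is why a single R2 and MSE can be computed over the full set of cross-validated predictions.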

Determining feature importance for cameraaccessory from the model used for cross validation

In [390]:
# Linear regression model parameters
# Limit float output to 3 decimal places
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x))
pd.set_option('display.precision',1)

# Intercept first, then one coefficient per feature, rounded to 3 decimals
cameraaccessory_lr_model_parameters = list(cameraaccessory_dlmul_model_cv.coef_)
cameraaccessory_lr_model_parameters.insert(0, cameraaccessory_dlmul_model_cv.intercept_)
cameraaccessory_lr_model_parameters = [round(x, 3) for x in cameraaccessory_lr_model_parameters]
# Take the column names from the frame the model was fitted on
cols = X_cameraaccessory_dlmul.columns
cols = cols.insert(0, "constant")
cameraaccessory_lr_coef = list(zip(cols, cameraaccessory_lr_model_parameters))
cameraaccessory_lr_coef
Out[390]:
[('constant', 0.0),
 ('gmv_lag1', 0.003),
 ('gmv_lag2', -0.007),
 ('gmv_lag3', -0.002),
 ('Discount%', -0.001),
 ('Discount%_lag1', 0.007),
 ('Discount%_lag2', 0.019),
 ('Discount%_lag3', -0.008),
 ('deliverybdays', -0.002),
 ('deliverybdays_lag1', -0.009),
 ('deliverybdays_lag2', 0.004),
 ('deliverybdays_lag3', -0.002),
 ('deliverycdays', -0.002),
 ('deliverycdays_lag1', -0.007),
 ('deliverycdays_lag2', 0.005),
 ('deliverycdays_lag3', -0.002),
 ('sla', -0.035),
 ('sla_lag1', -0.004),
 ('sla_lag2', 0.003),
 ('sla_lag3', 0.014),
 ('product_procurement_sla', 0.052),
 ('product_procurement_sla_lag1', -0.02),
 ('product_procurement_sla_lag2', -0.027),
 ('product_procurement_sla_lag3', 0.023),
 ('is_cod', 0.053),
 ('is_cod_lag1', 0.009),
 ('is_cod_lag2', -0.003),
 ('is_cod_lag3', -0.002),
 ('is_mass_market', 0.061),
 ('is_mass_market_lag1', 0.005),
 ('is_mass_market_lag2', -0.009),
 ('is_mass_market_lag3', -0.006),
 ('product_vertical_cameraaccessory', 0.06),
 ('product_vertical_cameraaccessory_lag1', 0.01),
 ('product_vertical_cameraaccessory_lag2', -0.01),
 ('product_vertical_cameraaccessory_lag3', -0.008),
 ('product_vertical_camerabag', 0.056),
 ('product_vertical_camerabag_lag1', 0.007),
 ('product_vertical_camerabag_lag2', -0.003),
 ('product_vertical_camerabag_lag3', -0.009),
 ('product_vertical_camerabattery', 0.059),
 ('product_vertical_camerabattery_lag1', 0.004),
 ('product_vertical_camerabattery_lag2', -0.008),
 ('product_vertical_camerabattery_lag3', -0.006),
 ('product_vertical_camerabatterycharger', 0.054),
 ('product_vertical_camerabatterycharger_lag1', 0.01),
 ('product_vertical_camerabatterycharger_lag2', -0.008),
 ('product_vertical_camerabatterycharger_lag3', -0.009),
 ('product_vertical_camerabatterygrip', 0.039),
 ('product_vertical_camerabatterygrip_lag1', 0.007),
 ('product_vertical_camerabatterygrip_lag2', -0.01),
 ('product_vertical_camerabatterygrip_lag3', -0.008),
 ('product_vertical_cameraeyecup', -0.006),
 ('product_vertical_cameraeyecup_lag1', -0.013),
 ('product_vertical_cameraeyecup_lag2', 0.01),
 ('product_vertical_cameraeyecup_lag3', 0.011),
 ('product_vertical_camerafilmrolls', 0.015),
 ('product_vertical_camerafilmrolls_lag1', 0.002),
 ('product_vertical_camerafilmrolls_lag2', 0.003),
 ('product_vertical_camerafilmrolls_lag3', 0.01),
 ('product_vertical_camerahousing', -0.008),
 ('product_vertical_camerahousing_lag1', -0.011),
 ('product_vertical_camerahousing_lag2', 0.001),
 ('product_vertical_camerahousing_lag3', 0.012),
 ('product_vertical_cameraledlight', 0.0),
 ('product_vertical_cameraledlight_lag1', 0.0),
 ('product_vertical_cameraledlight_lag2', 0.0),
 ('product_vertical_cameraledlight_lag3', 0.0),
 ('product_vertical_cameramicrophone', -0.009),
 ('product_vertical_cameramicrophone_lag1', 0.003),
 ('product_vertical_cameramicrophone_lag2', 0.002),
 ('product_vertical_cameramicrophone_lag3', 0.001),
 ('product_vertical_cameramount', 0.019),
 ('product_vertical_cameramount_lag1', 0.014),
 ('product_vertical_cameramount_lag2', 0.01),
 ('product_vertical_cameramount_lag3', -0.011),
 ('product_vertical_cameraremotecontrol', 0.055),
 ('product_vertical_cameraremotecontrol_lag1', 0.007),
 ('product_vertical_cameraremotecontrol_lag2', -0.005),
 ('product_vertical_cameraremotecontrol_lag3', -0.005),
 ('product_vertical_cameratripod', 0.059),
 ('product_vertical_cameratripod_lag1', 0.003),
 ('product_vertical_cameratripod_lag2', -0.006),
 ('product_vertical_cameratripod_lag3', -0.014),
 ('product_vertical_extensiontube', -0.006),
 ('product_vertical_extensiontube_lag1', -0.004),
 ('product_vertical_extensiontube_lag2', 0.005),
 ('product_vertical_extensiontube_lag3', -0.006),
 ('product_vertical_filter', 0.055),
 ('product_vertical_filter_lag1', 0.005),
 ('product_vertical_filter_lag2', -0.004),
 ('product_vertical_filter_lag3', -0.005),
 ('product_vertical_flash', 0.034),
 ('product_vertical_flash_lag1', 0.002),
 ('product_vertical_flash_lag2', -0.0),
 ('product_vertical_flash_lag3', 0.001),
 ('product_vertical_flashshoeadapter', 0.0),
 ('product_vertical_flashshoeadapter_lag1', 0.0),
 ('product_vertical_flashshoeadapter_lag2', 0.0),
 ('product_vertical_flashshoeadapter_lag3', 0.0),
 ('product_vertical_lens', 0.06),
 ('product_vertical_lens_lag1', 0.012),
 ('product_vertical_lens_lag2', -0.006),
 ('product_vertical_lens_lag3', -0.01),
 ('product_vertical_reflectorumbrella', 0.003),
 ('product_vertical_reflectorumbrella_lag1', -0.006),
 ('product_vertical_reflectorumbrella_lag2', -0.002),
 ('product_vertical_reflectorumbrella_lag3', 0.011),
 ('product_vertical_softbox', 0.003),
 ('product_vertical_softbox_lag1', -0.009),
 ('product_vertical_softbox_lag2', 0.001),
 ('product_vertical_softbox_lag3', 0.014),
 ('product_vertical_strap', 0.046),
 ('product_vertical_strap_lag1', 0.012),
 ('product_vertical_strap_lag2', -0.01),
 ('product_vertical_strap_lag3', -0.016),
 ('product_vertical_teleconverter', 0.0),
 ('product_vertical_teleconverter_lag1', 0.0),
 ('product_vertical_teleconverter_lag2', 0.0),
 ('product_vertical_teleconverter_lag3', 0.0),
 ('product_vertical_telescope', 0.037),
 ('product_vertical_telescope_lag1', 0.012),
 ('product_vertical_telescope_lag2', -0.001),
 ('product_vertical_telescope_lag3', -0.008),
 ('payday_week', 0.0),
 ('payday_week_lag1', 0.0),
 ('payday_week_lag2', 0.0),
 ('payday_week_lag3', 0.0),
 ('holiday_week', 0.0),
 ('holiday_week_lag1', 0.0),
 ('holiday_week_lag2', 0.0),
 ('holiday_week_lag3', 0.0),
 ('Total Investment', 0.008),
 ('Total Investment_lag1', 0.005),
 ('Total Investment_lag2', 0.004),
 ('Total Investment_lag3', -0.001),
 ('Total Investment_SMA_3', 0.003),
 ('Total Investment_SMA_3_lag1', 0.0),
 ('Total Investment_SMA_3_lag2', -0.007),
 ('Total Investment_SMA_3_lag3', -0.003),
 ('Total Investment_SMA_5', -0.014),
 ('Total Investment_SMA_5_lag1', -0.004),
 ('Total Investment_SMA_5_lag2', 0.012),
 ('Total Investment_SMA_5_lag3', 0.012),
 ('Total Investment_EMA_8', 0.001),
 ('Total Investment_EMA_8_lag1', 0.0),
 ('Total Investment_EMA_8_lag2', 0.007),
 ('Total Investment_EMA_8_lag3', 0.005),
 ('Total_Investment_Ad_Stock', 0.001),
 ('Total_Investment_Ad_Stock_lag1', -0.0),
 ('Total_Investment_Ad_Stock_lag2', 0.004),
 ('Total_Investment_Ad_Stock_lag3', 0.001),
 ('TV', 0.016),
 ('TV_lag1', 0.014),
 ('TV_lag2', -0.005),
 ('TV_lag3', -0.007),
 ('TV_SMA_3', -0.011),
 ('TV_SMA_3_lag1', -0.01),
 ('TV_SMA_3_lag2', 0.021),
 ('TV_SMA_3_lag3', 0.009),
 ('TV_SMA_5', 0.001),
 ('TV_SMA_5_lag1', 0.008),
 ('TV_SMA_5_lag2', -0.009),
 ('TV_SMA_5_lag3', -0.018),
 ('TV_EMA_8', -0.001),
 ('TV_EMA_8_lag1', 0.004),
 ('TV_EMA_8_lag2', 0.001),
 ('TV_EMA_8_lag3', 0.002),
 ('TV_Ad_Stock', -0.002),
 ('TV_Ad_Stock_lag1', 0.002),
 ('TV_Ad_Stock_lag2', 0.0),
 ('TV_Ad_Stock_lag3', 0.0),
 ('Digital', 0.0),
 ('Digital_lag1', 0.004),
 ('Digital_lag2', -0.001),
 ('Digital_lag3', 0.006),
 ('Digital_SMA_3', 0.001),
 ('Digital_SMA_3_lag1', -0.0),
 ('Digital_SMA_3_lag2', 0.003),
 ('Digital_SMA_3_lag3', 0.003),
 ('Digital_SMA_5', 0.002),
 ('Digital_SMA_5_lag1', -0.002),
 ('Digital_SMA_5_lag2', -0.004),
 ('Digital_SMA_5_lag3', -0.005),
 ('Digital_EMA_8', -0.001),
 ('Digital_EMA_8_lag1', 0.0),
 ('Digital_EMA_8_lag2', 0.0),
 ('Digital_EMA_8_lag3', 0.006),
 ('Digital_Ad_Stock', -0.0),
 ('Digital_Ad_Stock_lag1', 0.001),
 ('Digital_Ad_Stock_lag2', -0.003),
 ('Digital_Ad_Stock_lag3', 0.002),
 ('Sponsorship', 0.011),
 ('Sponsorship_lag1', 0.009),
 ('Sponsorship_lag2', 0.002),
 ('Sponsorship_lag3', -0.006),
 ('Sponsorship_SMA_3', -0.0),
 ('Sponsorship_SMA_3_lag1', -0.003),
 ('Sponsorship_SMA_3_lag2', -0.002),
 ('Sponsorship_SMA_3_lag3', -0.004),
 ('Sponsorship_SMA_5', -0.013),
 ('Sponsorship_SMA_5_lag1', -0.005),
 ('Sponsorship_SMA_5_lag2', 0.006),
 ('Sponsorship_SMA_5_lag3', 0.003),
 ('Sponsorship_EMA_8', 0.0),
 ('Sponsorship_EMA_8_lag1', 0.001),
 ('Sponsorship_EMA_8_lag2', 0.006),
 ('Sponsorship_EMA_8_lag3', 0.003),
 ('Sponsorship_Ad_Stock', -0.0),
 ('Sponsorship_Ad_Stock_lag1', -0.0),
 ('Sponsorship_Ad_Stock_lag2', 0.003),
 ('Sponsorship_Ad_Stock_lag3', -0.002),
 ('Content Marketing', -0.012),
 ('Content Marketing_lag1', -0.009),
 ('Content Marketing_lag2', 0.0),
 ('Content Marketing_lag3', -0.002),
 ('Content Marketing_SMA_3', -0.001),
 ('Content Marketing_SMA_3_lag1', -0.003),
 ('Content Marketing_SMA_3_lag2', -0.002),
 ('Content Marketing_SMA_3_lag3', -0.007),
 ('Content Marketing_SMA_5', 0.001),
 ('Content Marketing_SMA_5_lag1', -0.004),
 ('Content Marketing_SMA_5_lag2', -0.004),
 ('Content Marketing_SMA_5_lag3', -0.008),
 ('Content Marketing_EMA_8', 0.0),
 ('Content Marketing_EMA_8_lag1', -0.003),
 ('Content Marketing_EMA_8_lag2', -0.006),
 ('Content Marketing_EMA_8_lag3', -0.004),
 ('Content_Marketing_Ad_Stock', -0.003),
 ('Content_Marketing_Ad_Stock_lag1', -0.003),
 ('Content_Marketing_Ad_Stock_lag2', 0.001),
 ('Content_Marketing_Ad_Stock_lag3', 0.0),
 ('Online marketing', 0.011),
 ('Online marketing_lag1', 0.011),
 ('Online marketing_lag2', -0.001),
 ('Online marketing_lag3', -0.005),
 ('Online marketing_SMA_3', -0.005),
 ('Online marketing_SMA_3_lag1', -0.005),
 ('Online marketing_SMA_3_lag2', 0.013),
 ('Online marketing_SMA_3_lag3', 0.008),
 ('Online marketing_SMA_5', -0.002),
 ('Online marketing_SMA_5_lag1', 0.005),
 ('Online marketing_SMA_5_lag2', 0.001),
 ('Online marketing_SMA_5_lag3', -0.004),
 ('Online marketing_EMA_8', 0.001),
 ('Online marketing_EMA_8_lag1', 0.004),
 ('Online marketing_EMA_8_lag2', 0.005),
 ('Online marketing_EMA_8_lag3', 0.004),
 ('Online_marketing_Ad_Stock', -0.001),
 ('Online_marketing_Ad_Stock_lag1', 0.002),
 ('Online_marketing_Ad_Stock_lag2', 0.003),
 ('Online_marketing_Ad_Stock_lag3', 0.001),
 ('Affiliates', 0.011),
 ('Affiliates_lag1', 0.012),
 ('Affiliates_lag2', -0.003),
 ('Affiliates_lag3', -0.005),
 ('Affiliates_SMA_3', -0.005),
 ('Affiliates_SMA_3_lag1', -0.005),
 ('Affiliates_SMA_3_lag2', 0.019),
 ('Affiliates_SMA_3_lag3', 0.01),
 ('Affiliates_SMA_5', 0.007),
 ('Affiliates_SMA_5_lag1', 0.007),
 ('Affiliates_SMA_5_lag2', -0.007),
 ('Affiliates_SMA_5_lag3', -0.014),
 ('Affiliates_EMA_8', 0.001),
 ('Affiliates_EMA_8_lag1', 0.005),
 ('Affiliates_EMA_8_lag2', 0.002),
 ('Affiliates_EMA_8_lag3', 0.002),
 ('Affiliates_Ad_Stock', -0.0),
 ('Affiliates_Ad_Stock_lag1', 0.003),
 ('Affiliates_Ad_Stock_lag2', 0.001),
 ('Affiliates_Ad_Stock_lag3', -0.0),
 ('SEM', -0.001),
 ('SEM_lag1', -0.0),
 ('SEM_lag2', 0.0),
 ('SEM_lag3', -0.0),
 ('SEM_SMA_3', -0.0),
 ('SEM_SMA_3_lag1', -0.004),
 ('SEM_SMA_3_lag2', -0.003),
 ('SEM_SMA_3_lag3', 0.0),
 ('SEM_SMA_5', -0.008),
 ('SEM_SMA_5_lag1', -0.006),
 ('SEM_SMA_5_lag2', 0.005),
 ('SEM_SMA_5_lag3', 0.007),
 ('SEM_EMA_8', -0.002),
 ('SEM_EMA_8_lag1', -0.001),
 ('SEM_EMA_8_lag2', 0.005),
 ('SEM_EMA_8_lag3', 0.009),
 ('SEM_Ad_Stock', -0.002),
 ('SEM_Ad_Stock_lag1', -0.001),
 ('SEM_Ad_Stock_lag2', 0.0),
 ('SEM_Ad_Stock_lag3', 0.001),
 ('Radio', 0.005),
 ('Radio_lag1', 0.003),
 ('Radio_lag2', 0.0),
 ('Radio_lag3', -0.006),
 ('Radio_SMA_3', -0.006),
 ('Radio_SMA_3_lag1', 0.002),
 ('Radio_SMA_3_lag2', 0.01),
 ('Radio_SMA_3_lag3', 0.001),
 ('Radio_SMA_5', 0.011),
 ('Radio_SMA_5_lag1', -0.007),
 ('Radio_SMA_5_lag2', -0.003),
 ('Radio_SMA_5_lag3', 0.006),
 ('Radio_EMA_8', 0.001),
 ('Radio_EMA_8_lag1', -0.001),
 ('Radio_EMA_8_lag2', -0.0),
 ('Radio_EMA_8_lag3', 0.002),
 ('Radio_Ad_Stock', -0.001),
 ('Radio_Ad_Stock_lag1', -0.001),
 ('Radio_Ad_Stock_lag2', 0.004),
 ('Radio_Ad_Stock_lag3', 0.004),
 ('Other', -0.0),
 ('Other_lag1', -0.001),
 ('Other_lag2', -0.001),
 ('Other_lag3', 0.008),
 ('Other_SMA_3', 0.002),
 ('Other_SMA_3_lag1', 0.006),
 ('Other_SMA_3_lag2', -0.005),
 ('Other_SMA_3_lag3', -0.002),
 ('Other_SMA_5', -0.002),
 ('Other_SMA_5_lag1', 0.001),
 ('Other_SMA_5_lag2', -0.005),
 ('Other_SMA_5_lag3', -0.005),
 ('Other_EMA_8', -0.004),
 ('Other_EMA_8_lag1', -0.004),
 ('Other_EMA_8_lag2', -0.002),
 ('Other_EMA_8_lag3', -0.0),
 ('Other_Ad_Stock', -0.004),
 ('Other_Ad_Stock_lag1', -0.002),
 ('Other_Ad_Stock_lag2', 0.0),
 ('Other_Ad_Stock_lag3', -0.0),
 ('NPS', -0.007),
 ('NPS_lag1', -0.009),
 ('NPS_lag2', 0.008),
 ('NPS_lag3', 0.003),
 ('NPS_SMA_3', 0.008),
 ('NPS_SMA_3_lag1', 0.003),
 ('NPS_SMA_3_lag2', -0.028),
 ('NPS_SMA_3_lag3', -0.01),
 ('NPS_SMA_5', -0.028),
 ('NPS_SMA_5_lag1', -0.01),
 ('NPS_SMA_5_lag2', 0.021),
 ('NPS_SMA_5_lag3', 0.025),
 ('Stock Index', -0.009),
 ('Stock Index_lag1', -0.007),
 ('Stock Index_lag2', 0.008),
 ('Stock Index_lag3', 0.004),
 ('Stock Index_SMA_3', 0.008),
 ('Stock Index_SMA_3_lag1', 0.004),
 ('Stock Index_SMA_3_lag2', -0.027),
 ('Stock Index_SMA_3_lag3', -0.009),
 ('Stock Index_SMA_5', -0.027),
 ('Stock Index_SMA_5_lag1', -0.009),
 ('Stock Index_SMA_5_lag2', 0.021),
 ('Stock Index_SMA_5_lag3', 0.025),
 ('Max Temp', 0.002),
 ('Max Temp_lag1', -0.009),
 ('Max Temp_lag2', 0.003),
 ('Max Temp_lag3', 0.001),
 ('Min Temp', -0.002),
 ('Min Temp_lag1', -0.008),
 ('Min Temp_lag2', -0.0),
 ('Min Temp_lag3', 0.002),
 ('Mean Temp', 0.009),
 ('Mean Temp_lag1', -0.012),
 ('Mean Temp_lag2', -0.001),
 ('Mean Temp_lag3', 0.008),
 ('Heat Deg Days', -0.001),
 ('Heat Deg Days_lag1', 0.005),
 ('Heat Deg Days_lag2', -0.001),
 ('Heat Deg Days_lag3', -0.012),
 ('Cool Deg Days', 0.019),
 ('Cool Deg Days_lag1', -0.005),
 ('Cool Deg Days_lag2', -0.005),
 ('Cool Deg Days_lag3', -0.005),
 ('Total Rain (mm)', 0.013),
 ('Total Rain (mm)_lag1', -0.005),
 ('Total Rain (mm)_lag2', -0.004),
 ('Total Rain (mm)_lag3', -0.003),
 ('Total Snow (cm)', -0.008),
 ('Total Snow (cm)_lag1', -0.003),
 ('Total Snow (cm)_lag2', 0.006),
 ('Total Snow (cm)_lag3', -0.001),
 ('Total Precip (mm)', 0.01),
 ('Total Precip (mm)_lag1', -0.004),
 ('Total Precip (mm)_lag2', -0.002),
 ('Total Precip (mm)_lag3', -0.005),
 ('Snow on Grnd (cm)', -0.001),
 ('Snow on Grnd (cm)_lag1', 0.006),
 ('Snow on Grnd (cm)_lag2', -0.006),
 ('Snow on Grnd (cm)_lag3', 0.007),
 ('Sale', -0.007),
 ('Sale_lag1', -0.002),
 ('Sale_lag2', 0.006),
 ('Sale_lag3', -0.006)]
In [391]:
cameraaccessory_lr_coef_df = pd.DataFrame(cameraaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.rename(columns=col_rename)
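# Drop the first row, which holds the intercept ('constant').
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.iloc[1:,:]
# Keep only the features whose rounded coefficient is non-zero.
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.loc[cameraaccessory_lr_coef_df['Coefficients']!=0.0]
# Sort from the largest positive to the largest negative coefficient.
cameraaccessory_lr_coef_df = cameraaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)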
cameraaccessory_lr_coef_df
Out[391]:
Features Coefficients
28 is_mass_market 0.061
100 product_vertical_lens 0.060
32 product_vertical_cameraaccessory 0.060
40 product_vertical_camerabattery 0.059
80 product_vertical_cameratripod 0.059
36 product_vertical_camerabag 0.056
88 product_vertical_filter 0.055
76 product_vertical_cameraremotecontrol 0.055
44 product_vertical_camerabatterycharger 0.054
24 is_cod 0.053
20 product_procurement_sla 0.052
112 product_vertical_strap 0.046
48 product_vertical_camerabatterygrip 0.039
120 product_vertical_telescope 0.037
92 product_vertical_flash 0.034
355 Stock Index_SMA_5_lag3 0.025
343 NPS_SMA_5_lag3 0.025
23 product_procurement_sla_lag3 0.023
354 Stock Index_SMA_5_lag2 0.021
342 NPS_SMA_5_lag2 0.021
158 TV_SMA_3_lag2 0.021
72 product_vertical_cameramount 0.019
372 Cool Deg Days 0.019
6 Discount%_lag2 0.019
258 Affiliates_SMA_3_lag2 0.019
152 TV 0.016
56 product_vertical_camerafilmrolls 0.015
73 product_vertical_cameramount_lag1 0.014
19 sla_lag3 0.014
153 TV_lag1 0.014
111 product_vertical_softbox_lag3 0.014
376 Total Rain (mm) 0.013
238 Online marketing_SMA_3_lag2 0.013
63 product_vertical_camerahousing_lag3 0.012
142 Total Investment_SMA_5_lag2 0.012
101 product_vertical_lens_lag1 0.012
113 product_vertical_strap_lag1 0.012
121 product_vertical_telescope_lag1 0.012
143 Total Investment_SMA_5_lag3 0.012
253 Affiliates_lag1 0.012
300 Radio_SMA_5 0.011
233 Online marketing_lag1 0.011
107 product_vertical_reflectorumbrella_lag3 0.011
232 Online marketing 0.011
252 Affiliates 0.011
55 product_vertical_cameraeyecup_lag3 0.011
192 Sponsorship 0.011
384 Total Precip (mm) 0.010
59 product_vertical_camerafilmrolls_lag3 0.010
259 Affiliates_SMA_3_lag3 0.010
74 product_vertical_cameramount_lag2 0.010
33 product_vertical_cameraaccessory_lag1 0.010
54 product_vertical_cameraeyecup_lag2 0.010
298 Radio_SMA_3_lag2 0.010
45 product_vertical_camerabatterycharger_lag1 0.010
193 Sponsorship_lag1 0.009
364 Mean Temp 0.009
159 TV_SMA_3_lag3 0.009
25 is_cod_lag1 0.009
287 SEM_EMA_8_lag3 0.009
161 TV_SMA_5_lag1 0.008
346 Stock Index_lag2 0.008
367 Mean Temp_lag3 0.008
348 Stock Index_SMA_3 0.008
239 Online marketing_SMA_3_lag3 0.008
315 Other_lag3 0.008
132 Total Investment 0.008
336 NPS_SMA_3 0.008
334 NPS_lag2 0.008
37 product_vertical_camerabag_lag1 0.007
49 product_vertical_camerabatterygrip_lag1 0.007
260 Affiliates_SMA_5 0.007
77 product_vertical_cameraremotecontrol_lag1 0.007
261 Affiliates_SMA_5_lag1 0.007
5 Discount%_lag1 0.007
146 Total Investment_EMA_8_lag2 0.007
391 Snow on Grnd (cm)_lag3 0.007
283 SEM_SMA_5_lag3 0.007
187 Digital_EMA_8_lag3 0.006
202 Sponsorship_SMA_5_lag2 0.006
394 Sale_lag2 0.006
317 Other_SMA_3_lag1 0.006
206 Sponsorship_EMA_8_lag2 0.006
382 Total Snow (cm)_lag2 0.006
175 Digital_lag3 0.006
389 Snow on Grnd (cm)_lag1 0.006
303 Radio_SMA_5_lag3 0.006
14 deliverycdays_lag2 0.005
265 Affiliates_EMA_8_lag1 0.005
369 Heat Deg Days_lag1 0.005
147 Total Investment_EMA_8_lag3 0.005
246 Online marketing_EMA_8_lag2 0.005
133 Total Investment_lag1 0.005
241 Online marketing_SMA_5_lag1 0.005
282 SEM_SMA_5_lag2 0.005
286 SEM_EMA_8_lag2 0.005
89 product_vertical_filter_lag1 0.005
86 product_vertical_extensiontube_lag2 0.005
292 Radio 0.005
29 is_mass_market_lag1 0.005
41 product_vertical_camerabattery_lag1 0.004
173 Digital_lag1 0.004
134 Total Investment_lag2 0.004
311 Radio_Ad_Stock_lag3 0.004
310 Radio_Ad_Stock_lag2 0.004
165 TV_EMA_8_lag1 0.004
347 Stock Index_lag3 0.004
245 Online marketing_EMA_8_lag1 0.004
349 Stock Index_SMA_3_lag1 0.004
10 deliverybdays_lag2 0.004
247 Online marketing_EMA_8_lag3 0.004
150 Total_Investment_Ad_Stock_lag2 0.004
179 Digital_SMA_3_lag3 0.003
203 Sponsorship_SMA_5_lag3 0.003
250 Online_marketing_Ad_Stock_lag2 0.003
210 Sponsorship_Ad_Stock_lag2 0.003
207 Sponsorship_EMA_8_lag3 0.003
178 Digital_SMA_3_lag2 0.003
1 gmv_lag1 0.003
69 product_vertical_cameramicrophone_lag1 0.003
81 product_vertical_cameratripod_lag1 0.003
18 sla_lag2 0.003
358 Max Temp_lag2 0.003
337 NPS_SMA_3_lag1 0.003
335 NPS_lag3 0.003
58 product_vertical_camerafilmrolls_lag2 0.003
136 Total Investment_SMA_3 0.003
293 Radio_lag1 0.003
269 Affiliates_Ad_Stock_lag1 0.003
104 product_vertical_reflectorumbrella 0.003
108 product_vertical_softbox 0.003
249 Online_marketing_Ad_Stock_lag1 0.002
356 Max Temp 0.002
180 Digital_SMA_5 0.002
316 Other_SMA_3 0.002
57 product_vertical_camerafilmrolls_lag1 0.002
267 Affiliates_EMA_8_lag3 0.002
169 TV_Ad_Stock_lag1 0.002
93 product_vertical_flash_lag1 0.002
167 TV_EMA_8_lag3 0.002
307 Radio_EMA_8_lag3 0.002
194 Sponsorship_lag2 0.002
266 Affiliates_EMA_8_lag2 0.002
70 product_vertical_cameramicrophone_lag2 0.002
363 Min Temp_lag3 0.002
297 Radio_SMA_3_lag1 0.002
191 Digital_Ad_Stock_lag3 0.002
264 Affiliates_EMA_8 0.001
359 Max Temp_lag3 0.001
270 Affiliates_Ad_Stock_lag2 0.001
220 Content Marketing_SMA_5 0.001
321 Other_SMA_5_lag1 0.001
230 Content_Marketing_Ad_Stock_lag2 0.001
304 Radio_EMA_8 0.001
299 Radio_SMA_3_lag3 0.001
291 SEM_Ad_Stock_lag3 0.001
242 Online marketing_SMA_5_lag2 0.001
244 Online marketing_EMA_8 0.001
251 Online_marketing_Ad_Stock_lag3 0.001
205 Sponsorship_EMA_8_lag1 0.001
151 Total_Investment_Ad_Stock_lag3 0.001
176 Digital_SMA_3 0.001
189 Digital_Ad_Stock_lag1 0.001
62 product_vertical_camerahousing_lag2 0.001
95 product_vertical_flash_lag3 0.001
110 product_vertical_softbox_lag2 0.001
144 Total Investment_EMA_8 0.001
148 Total_Investment_Ad_Stock 0.001
160 TV_SMA_5 0.001
166 TV_EMA_8_lag2 0.001
71 product_vertical_cameramicrophone_lag3 0.001
368 Heat Deg Days -0.001
383 Total Snow (cm)_lag3 -0.001
122 product_vertical_telescope_lag2 -0.001
366 Mean Temp_lag2 -0.001
248 Online_marketing_Ad_Stock -0.001
370 Heat Deg Days_lag2 -0.001
272 SEM -0.001
4 Discount% -0.001
285 SEM_EMA_8_lag1 -0.001
164 TV_EMA_8 -0.001
388 Snow on Grnd (cm) -0.001
289 SEM_Ad_Stock_lag1 -0.001
314 Other_lag2 -0.001
313 Other_lag1 -0.001
234 Online marketing_lag2 -0.001
184 Digital_EMA_8 -0.001
308 Radio_Ad_Stock -0.001
305 Radio_EMA_8_lag1 -0.001
174 Digital_lag2 -0.001
309 Radio_Ad_Stock_lag1 -0.001
216 Content Marketing_SMA_3 -0.001
135 Total Investment_lag3 -0.001
218 Content Marketing_SMA_3_lag2 -0.002
319 Other_SMA_3_lag3 -0.002
360 Min Temp -0.002
168 TV_Ad_Stock -0.002
8 deliverybdays -0.002
27 is_cod_lag3 -0.002
15 deliverycdays_lag3 -0.002
181 Digital_SMA_5_lag1 -0.002
329 Other_Ad_Stock_lag1 -0.002
198 Sponsorship_SMA_3_lag2 -0.002
211 Sponsorship_Ad_Stock_lag3 -0.002
393 Sale_lag1 -0.002
326 Other_EMA_8_lag2 -0.002
386 Total Precip (mm)_lag2 -0.002
12 deliverycdays -0.002
106 product_vertical_reflectorumbrella_lag2 -0.002
288 SEM_Ad_Stock -0.002
240 Online marketing_SMA_5 -0.002
215 Content Marketing_lag3 -0.002
320 Other_SMA_5 -0.002
11 deliverybdays_lag3 -0.002
284 SEM_EMA_8 -0.002
3 gmv_lag3 -0.002
278 SEM_SMA_3_lag2 -0.003
38 product_vertical_camerabag_lag2 -0.003
197 Sponsorship_SMA_3_lag1 -0.003
26 is_cod_lag2 -0.003
190 Digital_Ad_Stock_lag2 -0.003
139 Total Investment_SMA_3_lag3 -0.003
225 Content Marketing_EMA_8_lag1 -0.003
228 Content_Marketing_Ad_Stock -0.003
302 Radio_SMA_5_lag2 -0.003
229 Content_Marketing_Ad_Stock_lag1 -0.003
379 Total Rain (mm)_lag3 -0.003
254 Affiliates_lag2 -0.003
381 Total Snow (cm)_lag1 -0.003
217 Content Marketing_SMA_3_lag1 -0.003
17 sla_lag1 -0.004
222 Content Marketing_SMA_5_lag2 -0.004
227 Content Marketing_EMA_8_lag3 -0.004
90 product_vertical_filter_lag2 -0.004
378 Total Rain (mm)_lag2 -0.004
385 Total Precip (mm)_lag1 -0.004
221 Content Marketing_SMA_5_lag1 -0.004
277 SEM_SMA_3_lag1 -0.004
182 Digital_SMA_5_lag2 -0.004
324 Other_EMA_8 -0.004
325 Other_EMA_8_lag1 -0.004
243 Online marketing_SMA_5_lag3 -0.004
199 Sponsorship_SMA_3_lag3 -0.004
328 Other_Ad_Stock -0.004
141 Total Investment_SMA_5_lag1 -0.004
85 product_vertical_extensiontube_lag1 -0.004
377 Total Rain (mm)_lag1 -0.005
318 Other_SMA_3_lag2 -0.005
375 Cool Deg Days_lag3 -0.005
322 Other_SMA_5_lag2 -0.005
323 Other_SMA_5_lag3 -0.005
374 Cool Deg Days_lag2 -0.005
373 Cool Deg Days_lag1 -0.005
387 Total Precip (mm)_lag3 -0.005
79 product_vertical_cameraremotecontrol_lag3 -0.005
236 Online marketing_SMA_3 -0.005
235 Online marketing_lag3 -0.005
78 product_vertical_cameraremotecontrol_lag2 -0.005
256 Affiliates_SMA_3 -0.005
237 Online marketing_SMA_3_lag1 -0.005
201 Sponsorship_SMA_5_lag1 -0.005
257 Affiliates_SMA_3_lag1 -0.005
255 Affiliates_lag3 -0.005
154 TV_lag2 -0.005
183 Digital_SMA_5_lag3 -0.005
91 product_vertical_filter_lag3 -0.005
226 Content Marketing_EMA_8_lag2 -0.006
43 product_vertical_camerabattery_lag3 -0.006
31 is_mass_market_lag3 -0.006
105 product_vertical_reflectorumbrella_lag1 -0.006
281 SEM_SMA_5_lag1 -0.006
395 Sale_lag3 -0.006
102 product_vertical_lens_lag2 -0.006
87 product_vertical_extensiontube_lag3 -0.006
52 product_vertical_cameraeyecup -0.006
390 Snow on Grnd (cm)_lag2 -0.006
84 product_vertical_extensiontube -0.006
295 Radio_lag3 -0.006
296 Radio_SMA_3 -0.006
82 product_vertical_cameratripod_lag2 -0.006
195 Sponsorship_lag3 -0.006
13 deliverycdays_lag1 -0.007
301 Radio_SMA_5_lag1 -0.007
155 TV_lag3 -0.007
262 Affiliates_SMA_5_lag2 -0.007
345 Stock Index_lag1 -0.007
138 Total Investment_SMA_3_lag2 -0.007
219 Content Marketing_SMA_3_lag3 -0.007
2 gmv_lag2 -0.007
392 Sale -0.007
332 NPS -0.007
380 Total Snow (cm) -0.008
123 product_vertical_telescope_lag3 -0.008
60 product_vertical_camerahousing -0.008
51 product_vertical_camerabatterygrip_lag3 -0.008
280 SEM_SMA_5 -0.008
7 Discount%_lag3 -0.008
361 Min Temp_lag1 -0.008
42 product_vertical_camerabattery_lag2 -0.008
35 product_vertical_cameraaccessory_lag3 -0.008
223 Content Marketing_SMA_5_lag3 -0.008
46 product_vertical_camerabatterycharger_lag2 -0.008
9 deliverybdays_lag1 -0.009
162 TV_SMA_5_lag2 -0.009
213 Content Marketing_lag1 -0.009
351 Stock Index_SMA_3_lag3 -0.009
109 product_vertical_softbox_lag1 -0.009
357 Max Temp_lag1 -0.009
68 product_vertical_cameramicrophone -0.009
333 NPS_lag1 -0.009
30 is_mass_market_lag2 -0.009
353 Stock Index_SMA_5_lag1 -0.009
47 product_vertical_camerabatterycharger_lag3 -0.009
39 product_vertical_camerabag_lag3 -0.009
344 Stock Index -0.009
103 product_vertical_lens_lag3 -0.010
114 product_vertical_strap_lag2 -0.010
50 product_vertical_camerabatterygrip_lag2 -0.010
339 NPS_SMA_3_lag3 -0.010
341 NPS_SMA_5_lag1 -0.010
157 TV_SMA_3_lag1 -0.010
34 product_vertical_cameraaccessory_lag2 -0.010
156 TV_SMA_3 -0.011
61 product_vertical_camerahousing_lag1 -0.011
75 product_vertical_cameramount_lag3 -0.011
371 Heat Deg Days_lag3 -0.012
365 Mean Temp_lag1 -0.012
212 Content Marketing -0.012
53 product_vertical_cameraeyecup_lag1 -0.013
200 Sponsorship_SMA_5 -0.013
83 product_vertical_cameratripod_lag3 -0.014
140 Total Investment_SMA_5 -0.014
263 Affiliates_SMA_5_lag3 -0.014
115 product_vertical_strap_lag3 -0.016
163 TV_SMA_5_lag3 -0.018
21 product_procurement_sla_lag1 -0.020
22 product_procurement_sla_lag2 -0.027
350 Stock Index_SMA_3_lag2 -0.027
352 Stock Index_SMA_5 -0.027
338 NPS_SMA_3_lag2 -0.028
340 NPS_SMA_5 -0.028
16 sla -0.035

Plotting the Features in descending order of Importance for cameraaccessory

In [392]:
# Use a tall figure so that each feature label gets its own row.
plt.figure(figsize=(10, 35), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=cameraaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 most important features affecting GMV (revenue) for cameraaccessory are:
Features Coefficients
is_mass_market 0.061
product_vertical_lens 0.060
product_vertical_cameraaccessory 0.060
product_vertical_camerabattery 0.059
product_vertical_cameratripod 0.059
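
The table above is sorted by signed coefficient, so strongly negative features such as sla (-0.035) end up at the bottom. A minimal sketch of ranking by magnitude instead, using a few rows copied from the output above; the names coef_df and ranked are illustrative and do not appear in the notebook:

```python
import pandas as pd

# A handful of (feature, coefficient) pairs copied from the table above.
coef_df = pd.DataFrame(
    {
        "Features": ["is_mass_market", "product_vertical_lens", "TV", "NPS_SMA_5", "sla"],
        "Coefficients": [0.061, 0.060, 0.016, -0.028, -0.035],
    }
)

# Order the rows by absolute coefficient so that strong negative effects
# are listed next to strong positive ones.
order = coef_df["Coefficients"].abs().sort_values(ascending=False).index
ranked = coef_df.reindex(order).reset_index(drop=True)
print(ranked)
```

For cameraaccessory the top 5 are the same either way, because the five largest positive coefficients all exceed 0.035 in magnitude.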

Building Linear Regression model for gamingaccessory

In [393]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

gamingaccessory_dlmul_model = LinearRegression().fit(X_gamingaccessory_dlmul_train, y_gamingaccessory_dlmul_train)
y_gamingaccessory_dlmul_test_pred = gamingaccessory_dlmul_model.predict(X_gamingaccessory_dlmul_test)

print('R2 Score: {}'.format(r2_score(y_gamingaccessory_dlmul_test, y_gamingaccessory_dlmul_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_dlmul_test, y_gamingaccessory_dlmul_test_pred)))
R2 Score: 0.9324798780623911
Mean Squared Error: 0.10610625019038579
With simple linear regression, we get an R2 score of 0.93 and an MSE of 0.11 on the test set.

Building Linear Regression model for gamingaccessory using K-fold Cross Validation

We will use cross_val_predict with 10-fold cross-validation for our linear regression.
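
As a rough illustration of what the cross-validated prediction in the next cell does (it calls cross_val_predict with cv=10), the sketch below compares that call with an explicit KFold loop. The data are synthetic and the names X_demo, y_demo, pred_auto and pred_manual are placeholders, not objects from this notebook:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in data (hypothetical; not the notebook's dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 4))
y_demo = X_demo @ np.array([0.5, -0.2, 0.0, 0.3]) + rng.normal(scale=0.1, size=60)

# Out-of-fold predictions from scikit-learn in a single call.
pred_auto = cross_val_predict(LinearRegression(), X_demo, y_demo, cv=10)

# The same idea written out: fit on 9 folds, predict the held-out fold.
pred_manual = np.empty_like(y_demo)
for train_idx, test_idx in KFold(n_splits=10).split(X_demo):
    fold_model = LinearRegression().fit(X_demo[train_idx], y_demo[train_idx])
    pred_manual[test_idx] = fold_model.predict(X_demo[test_idx])

# Both routes should give matching predictions for every row.
print(np.allclose(pred_auto, pred_manual))
```

Every row is predicted by a model that never saw it during fitting, which is why the cross-validated score is usually a little lower than a single train/test score.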

In [394]:
y_gamingaccessory_dlmul = gamingaccessory_dlmul_df.pop('gmv')
X_gamingaccessory_dlmul = gamingaccessory_dlmul_df
In [395]:
# Make cross-validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

# Fit on the full data set; this fitted model supplies the coefficients used for feature importance.
gamingaccessory_dlmul_model_cv = LinearRegression().fit(X_gamingaccessory_dlmul, y_gamingaccessory_dlmul)
# cross_val_predict clones and refits the estimator on each of the 10 folds,
# so these are out-of-fold predictions.
gamingaccessory_dlmul_predictions_cv = cross_val_predict(gamingaccessory_dlmul_model_cv, X_gamingaccessory_dlmul,
                                                         y_gamingaccessory_dlmul, cv=10)
# "accuracy" here is the R2 score of the out-of-fold predictions.
accuracy = metrics.r2_score(y_gamingaccessory_dlmul, gamingaccessory_dlmul_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_gamingaccessory_dlmul, gamingaccessory_dlmul_predictions_cv)))
Cross-Predicted Accuracy: 0.8933946293744466
Mean Squared Error: 0.1066053706255534
With simple linear regression and 10-fold cross-validation, we get an R2 score of 0.89 and an MSE of 0.11.
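
The two cross-validated figures printed above add up to 1 (0.8934 + 0.1066). That is what one would expect if gmv had been standardised to unit variance over the full sample, because R2 = 1 - MSE / Var(y). Whether the target was scaled that way is an assumption here, not something this section confirms. A small sketch of the identity on synthetic data (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
y_raw = rng.normal(loc=5.0, scale=3.0, size=200)

# Standardise to zero mean and unit (population) variance.
y_std = (y_raw - y_raw.mean()) / y_raw.std()

# Hypothetical predictions: the truth plus some noise.
y_hat = y_std + rng.normal(scale=0.3, size=200)

# MSE and R2 computed from their definitions.
mse = np.mean((y_std - y_hat) ** 2)
r2 = 1.0 - np.sum((y_std - y_hat) ** 2) / np.sum((y_std - y_std.mean()) ** 2)

# With Var(y) = 1 the two metrics are complements of each other.
print(round(r2 + mse, 6))
```

The identity does not hold for the earlier train/test split (0.9325 and 0.1061), since the variance of a test subset is generally not exactly 1.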

Determining Feature Importance for gamingaccessory from the model used for cv

In [396]:
# linear regression model parameters
#Limiting floats output to 3 decimal points
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x)) 
pd.set_option('display.precision',1)


gamingaccessory_lr_model_parameters = list(gamingaccessory_dlmul_model_cv.coef_)
gamingaccessory_lr_model_parameters.insert(0, gamingaccessory_dlmul_model_cv.intercept_)
gamingaccessory_lr_model_parameters = [round(x, 3) for x in gamingaccessory_lr_model_parameters]
# Take the column names from the frame the model was fitted on.
cols = X_gamingaccessory_dlmul.columns
cols = cols.insert(0, "constant")
gamingaccessory_lr_coef = list(zip(cols, gamingaccessory_lr_model_parameters))
gamingaccessory_lr_coef
Out[396]:
[('constant', 0.0),
 ('gmv_lag1', -0.0),
 ('gmv_lag2', -0.002),
 ('gmv_lag3', -0.001),
 ('Discount%', -0.059),
 ('Discount%_lag1', -0.006),
 ('Discount%_lag2', -0.002),
 ('Discount%_lag3', 0.001),
 ('deliverybdays', -0.005),
 ('deliverybdays_lag1', -0.002),
 ('deliverybdays_lag2', -0.0),
 ('deliverybdays_lag3', 0.016),
 ('deliverycdays', -0.004),
 ('deliverycdays_lag1', -0.003),
 ('deliverycdays_lag2', -0.001),
 ('deliverycdays_lag3', 0.019),
 ('sla', 0.024),
 ('sla_lag1', 0.043),
 ('sla_lag2', -0.053),
 ('sla_lag3', -0.005),
 ('product_procurement_sla', 0.009),
 ('product_procurement_sla_lag1', -0.006),
 ('product_procurement_sla_lag2', 0.008),
 ('product_procurement_sla_lag3', 0.014),
 ('is_cod', 0.072),
 ('is_cod_lag1', 0.006),
 ('is_cod_lag2', 0.006),
 ('is_cod_lag3', -0.001),
 ('is_mass_market', 0.082),
 ('is_mass_market_lag1', 0.004),
 ('is_mass_market_lag2', 0.002),
 ('is_mass_market_lag3', -0.006),
 ('product_vertical_gamecontrolmount', -0.0),
 ('product_vertical_gamecontrolmount_lag1', -0.0),
 ('product_vertical_gamecontrolmount_lag2', -0.0),
 ('product_vertical_gamecontrolmount_lag3', -0.0),
 ('product_vertical_gamepad', 0.088),
 ('product_vertical_gamepad_lag1', 0.007),
 ('product_vertical_gamepad_lag2', 0.003),
 ('product_vertical_gamepad_lag3', -0.01),
 ('product_vertical_gamingaccessorykit', 0.07),
 ('product_vertical_gamingaccessorykit_lag1', 0.008),
 ('product_vertical_gamingaccessorykit_lag2', 0.008),
 ('product_vertical_gamingaccessorykit_lag3', -0.01),
 ('product_vertical_gamingadapter', 0.06),
 ('product_vertical_gamingadapter_lag1', 0.008),
 ('product_vertical_gamingadapter_lag2', 0.005),
 ('product_vertical_gamingadapter_lag3', -0.004),
 ('product_vertical_gamingchargingstation', 0.007),
 ('product_vertical_gamingchargingstation_lag1', -0.008),
 ('product_vertical_gamingchargingstation_lag2', -0.001),
 ('product_vertical_gamingchargingstation_lag3', 0.012),
 ('product_vertical_gamingheadset', 0.067),
 ('product_vertical_gamingheadset_lag1', 0.003),
 ('product_vertical_gamingheadset_lag2', 0.008),
 ('product_vertical_gamingheadset_lag3', -0.006),
 ('product_vertical_gamingkeyboard', 0.074),
 ('product_vertical_gamingkeyboard_lag1', 0.001),
 ('product_vertical_gamingkeyboard_lag2', 0.003),
 ('product_vertical_gamingkeyboard_lag3', 0.002),
 ('product_vertical_gamingmemorycard', 0.03),
 ('product_vertical_gamingmemorycard_lag1', -0.012),
 ('product_vertical_gamingmemorycard_lag2', 0.011),
 ('product_vertical_gamingmemorycard_lag3', 0.005),
 ('product_vertical_gamingmouse', 0.085),
 ('product_vertical_gamingmouse_lag1', -0.003),
 ('product_vertical_gamingmouse_lag2', 0.004),
 ('product_vertical_gamingmouse_lag3', -0.002),
 ('product_vertical_gamingmousepad', 0.062),
 ('product_vertical_gamingmousepad_lag1', 0.008),
 ('product_vertical_gamingmousepad_lag2', -0.005),
 ('product_vertical_gamingmousepad_lag3', 0.013),
 ('product_vertical_gamingspeaker', -0.011),
 ('product_vertical_gamingspeaker_lag1', -0.002),
 ('product_vertical_gamingspeaker_lag2', -0.003),
 ('product_vertical_gamingspeaker_lag3', 0.006),
 ('product_vertical_joystickgamingwheel', 0.058),
 ('product_vertical_joystickgamingwheel_lag1', -0.005),
 ('product_vertical_joystickgamingwheel_lag2', -0.0),
 ('product_vertical_joystickgamingwheel_lag3', -0.008),
 ('product_vertical_motioncontroller', 0.052),
 ('product_vertical_motioncontroller_lag1', -0.0),
 ('product_vertical_motioncontroller_lag2', 0.018),
 ('product_vertical_motioncontroller_lag3', -0.003),
 ('product_vertical_tvoutcableaccessory', 0.068),
 ('product_vertical_tvoutcableaccessory_lag1', 0.01),
 ('product_vertical_tvoutcableaccessory_lag2', 0.007),
 ('product_vertical_tvoutcableaccessory_lag3', -0.009),
 ('payday_week', 0.0),
 ('payday_week_lag1', 0.0),
 ('payday_week_lag2', 0.0),
 ('payday_week_lag3', 0.0),
 ('holiday_week', 0.0),
 ('holiday_week_lag1', 0.0),
 ('holiday_week_lag2', 0.0),
 ('holiday_week_lag3', 0.0),
 ('Total Investment', -0.0),
 ('Total Investment_lag1', 0.001),
 ('Total Investment_lag2', 0.011),
 ('Total Investment_lag3', 0.005),
 ('Total Investment_SMA_3', 0.007),
 ('Total Investment_SMA_3_lag1', 0.001),
 ('Total Investment_SMA_3_lag2', -0.013),
 ('Total Investment_SMA_3_lag3', 0.002),
 ('Total Investment_SMA_5', -0.013),
 ('Total Investment_SMA_5_lag1', -0.002),
 ('Total Investment_SMA_5_lag2', -0.003),
 ('Total Investment_SMA_5_lag3', 0.014),
 ('Total Investment_EMA_8', 0.001),
 ('Total Investment_EMA_8_lag1', -0.002),
 ('Total Investment_EMA_8_lag2', 0.003),
 ('Total Investment_EMA_8_lag3', 0.005),
 ('Total_Investment_Ad_Stock', 0.001),
 ('Total_Investment_Ad_Stock_lag1', -0.002),
 ('Total_Investment_Ad_Stock_lag2', 0.002),
 ('Total_Investment_Ad_Stock_lag3', 0.002),
 ('TV', 0.012),
 ('TV_lag1', -0.001),
 ('TV_lag2', 0.013),
 ('TV_lag3', -0.002),
 ('TV_SMA_3', 0.01),
 ('TV_SMA_3_lag1', -0.013),
 ('TV_SMA_3_lag2', -0.011),
 ('TV_SMA_3_lag3', 0.002),
 ('TV_SMA_5', -0.014),
 ('TV_SMA_5_lag1', -0.012),
 ('TV_SMA_5_lag2', 0.014),
 ('TV_SMA_5_lag3', -0.013),
 ('TV_EMA_8', 0.001),
 ('TV_EMA_8_lag1', -0.0),
 ('TV_EMA_8_lag2', 0.003),
 ('TV_EMA_8_lag3', 0.003),
 ('TV_Ad_Stock', -0.001),
 ('TV_Ad_Stock_lag1', -0.005),
 ('TV_Ad_Stock_lag2', 0.001),
 ('TV_Ad_Stock_lag3', 0.002),
 ('Digital', -0.008),
 ('Digital_lag1', 0.004),
 ('Digital_lag2', 0.003),
 ('Digital_lag3', 0.017),
 ('Digital_SMA_3', 0.007),
 ('Digital_SMA_3_lag1', 0.003),
 ('Digital_SMA_3_lag2', -0.004),
 ('Digital_SMA_3_lag3', 0.005),
 ('Digital_SMA_5', 0.002),
 ('Digital_SMA_5_lag1', -0.004),
 ('Digital_SMA_5_lag2', 0.0),
 ('Digital_SMA_5_lag3', -0.005),
 ('Digital_EMA_8', 0.001),
 ('Digital_EMA_8_lag1', -0.002),
 ('Digital_EMA_8_lag2', -0.002),
 ('Digital_EMA_8_lag3', 0.006),
 ('Digital_Ad_Stock', 0.003),
 ('Digital_Ad_Stock_lag1', -0.001),
 ('Digital_Ad_Stock_lag2', -0.002),
 ('Digital_Ad_Stock_lag3', 0.005),
 ('Sponsorship', 0.003),
 ('Sponsorship_lag1', 0.004),
 ('Sponsorship_lag2', 0.014),
 ('Sponsorship_lag3', 0.002),
 ('Sponsorship_SMA_3', 0.009),
 ('Sponsorship_SMA_3_lag1', -0.003),
 ('Sponsorship_SMA_3_lag2', -0.019),
 ('Sponsorship_SMA_3_lag3', -0.001),
 ('Sponsorship_SMA_5', -0.018),
 ('Sponsorship_SMA_5_lag1', -0.009),
 ('Sponsorship_SMA_5_lag2', -0.001),
 ('Sponsorship_SMA_5_lag3', 0.006),
 ('Sponsorship_EMA_8', -0.0),
 ('Sponsorship_EMA_8_lag1', -0.003),
 ('Sponsorship_EMA_8_lag2', 0.002),
 ('Sponsorship_EMA_8_lag3', 0.004),
 ('Sponsorship_Ad_Stock', -0.0),
 ('Sponsorship_Ad_Stock_lag1', -0.005),
 ('Sponsorship_Ad_Stock_lag2', -0.0),
 ('Sponsorship_Ad_Stock_lag3', -0.0),
 ('Content Marketing', -0.021),
 ('Content Marketing_lag1', 0.012),
 ('Content Marketing_lag2', -0.007),
 ('Content Marketing_lag3', 0.002),
 ('Content Marketing_SMA_3', -0.004),
 ('Content Marketing_SMA_3_lag1', 0.006),
 ('Content Marketing_SMA_3_lag2', -0.003),
 ('Content Marketing_SMA_3_lag3', -0.003),
 ('Content Marketing_SMA_5', -0.003),
 ('Content Marketing_SMA_5_lag1', 0.007),
 ('Content Marketing_SMA_5_lag2', -0.004),
 ('Content Marketing_SMA_5_lag3', -0.011),
 ('Content Marketing_EMA_8', -0.004),
 ('Content Marketing_EMA_8_lag1', 0.005),
 ('Content Marketing_EMA_8_lag2', -0.006),
 ('Content Marketing_EMA_8_lag3', -0.007),
 ('Content_Marketing_Ad_Stock', -0.001),
 ('Content_Marketing_Ad_Stock_lag1', 0.006),
 ('Content_Marketing_Ad_Stock_lag2', -0.002),
 ('Content_Marketing_Ad_Stock_lag3', -0.001),
 ('Online marketing', 0.011),
 ('Online marketing_lag1', 0.003),
 ('Online marketing_lag2', 0.012),
 ('Online marketing_lag3', -0.004),
 ('Online marketing_SMA_3', 0.01),
 ('Online marketing_SMA_3_lag1', -0.008),
 ('Online marketing_SMA_3_lag2', -0.01),
 ('Online marketing_SMA_3_lag3', 0.001),
 ('Online marketing_SMA_5', -0.009),
 ('Online marketing_SMA_5_lag1', -0.009),
 ('Online marketing_SMA_5_lag2', 0.006),
 ('Online marketing_SMA_5_lag3', -0.004),
 ('Online marketing_EMA_8', 0.002),
 ('Online marketing_EMA_8_lag1', -0.0),
 ('Online marketing_EMA_8_lag2', 0.003),
 ('Online marketing_EMA_8_lag3', 0.003),
 ('Online_marketing_Ad_Stock', 0.001),
 ('Online_marketing_Ad_Stock_lag1', -0.003),
 ('Online_marketing_Ad_Stock_lag2', 0.0),
 ('Online_marketing_Ad_Stock_lag3', -0.0),
 ('Affiliates', 0.011),
 ('Affiliates_lag1', 0.003),
 ('Affiliates_lag2', 0.012),
 ('Affiliates_lag3', -0.004),
 ('Affiliates_SMA_3', 0.01),
 ('Affiliates_SMA_3_lag1', -0.008),
 ('Affiliates_SMA_3_lag2', -0.004),
 ('Affiliates_SMA_3_lag3', -0.001),
 ('Affiliates_SMA_5', -0.002),
 ('Affiliates_SMA_5_lag1', -0.009),
 ('Affiliates_SMA_5_lag2', 0.009),
 ('Affiliates_SMA_5_lag3', -0.014),
 ('Affiliates_EMA_8', 0.002),
 ('Affiliates_EMA_8_lag1', 0.0),
 ('Affiliates_EMA_8_lag2', 0.002),
 ('Affiliates_EMA_8_lag3', 0.001),
 ('Affiliates_Ad_Stock', 0.002),
 ('Affiliates_Ad_Stock_lag1', -0.002),
 ('Affiliates_Ad_Stock_lag2', 0.0),
 ('Affiliates_Ad_Stock_lag3', -0.002),
 ('SEM', -0.012),
 ('SEM_lag1', 0.008),
 ('SEM_lag2', 0.007),
 ('SEM_lag3', 0.005),
 ('SEM_SMA_3', 0.008),
 ('SEM_SMA_3_lag1', 0.002),
 ('SEM_SMA_3_lag2', -0.011),
 ('SEM_SMA_3_lag3', 0.003),
 ('SEM_SMA_5', -0.006),
 ('SEM_SMA_5_lag1', -0.002),
 ('SEM_SMA_5_lag2', -0.003),
 ('SEM_SMA_5_lag3', 0.004),
 ('SEM_EMA_8', 0.0),
 ('SEM_EMA_8_lag1', -0.001),
 ('SEM_EMA_8_lag2', -0.001),
 ('SEM_EMA_8_lag3', 0.005),
 ('SEM_Ad_Stock', 0.001),
 ('SEM_Ad_Stock_lag1', 0.0),
 ('SEM_Ad_Stock_lag2', -0.002),
 ('SEM_Ad_Stock_lag3', 0.0),
 ('Radio', 0.002),
 ('Radio_lag1', -0.012),
 ('Radio_lag2', 0.011),
 ('Radio_lag3', -0.01),
 ('Radio_SMA_3', -0.013),
 ('Radio_SMA_3_lag1', 0.004),
 ('Radio_SMA_3_lag2', -0.004),
 ('Radio_SMA_3_lag3', 0.006),
 ('Radio_SMA_5', -0.003),
 ('Radio_SMA_5_lag1', 0.006),
 ('Radio_SMA_5_lag2', -0.007),
 ('Radio_SMA_5_lag3', 0.012),
 ('Radio_EMA_8', 0.002),
 ('Radio_EMA_8_lag1', -0.001),
 ('Radio_EMA_8_lag2', -0.004),
 ('Radio_EMA_8_lag3', -0.002),
 ('Radio_Ad_Stock', 0.001),
 ('Radio_Ad_Stock_lag1', -0.001),
 ('Radio_Ad_Stock_lag2', -0.003),
 ('Radio_Ad_Stock_lag3', 0.0),
 ('Other', -0.007),
 ('Other_lag1', -0.01),
 ('Other_lag2', -0.004),
 ('Other_lag3', 0.021),
 ('Other_SMA_3', 0.0),
 ('Other_SMA_3_lag1', 0.011),
 ('Other_SMA_3_lag2', -0.005),
 ('Other_SMA_3_lag3', -0.0),
 ('Other_SMA_5', -0.007),
 ('Other_SMA_5_lag1', -0.0),
 ('Other_SMA_5_lag2', -0.003),
 ('Other_SMA_5_lag3', -0.006),
 ('Other_EMA_8', -0.004),
 ('Other_EMA_8_lag1', -0.0),
 ('Other_EMA_8_lag2', 0.001),
 ('Other_EMA_8_lag3', 0.005),
 ('Other_Ad_Stock', -0.006),
 ('Other_Ad_Stock_lag1', -0.0),
 ('Other_Ad_Stock_lag2', 0.0),
 ('Other_Ad_Stock_lag3', 0.004),
 ('NPS', -0.008),
 ('NPS_lag1', -0.007),
 ('NPS_lag2', -0.0),
 ('NPS_lag3', 0.004),
 ('NPS_SMA_3', 0.0),
 ('NPS_SMA_3_lag1', 0.004),
 ('NPS_SMA_3_lag2', -0.02),
 ('NPS_SMA_3_lag3', 0.002),
 ('NPS_SMA_5', -0.02),
 ('NPS_SMA_5_lag1', 0.002),
 ('NPS_SMA_5_lag2', -0.01),
 ('NPS_SMA_5_lag3', 0.029),
 ('Stock Index', 0.004),
 ('Stock Index_lag1', -0.006),
 ('Stock Index_lag2', 0.001),
 ('Stock Index_lag3', 0.004),
 ('Stock Index_SMA_3', 0.001),
 ('Stock Index_SMA_3_lag1', 0.004),
 ('Stock Index_SMA_3_lag2', -0.02),
 ('Stock Index_SMA_3_lag3', 0.002),
 ('Stock Index_SMA_5', -0.02),
 ('Stock Index_SMA_5_lag1', 0.002),
 ('Stock Index_SMA_5_lag2', -0.01),
 ('Stock Index_SMA_5_lag3', 0.029),
 ('Max Temp', 0.003),
 ('Max Temp_lag1', -0.012),
 ('Max Temp_lag2', -0.001),
 ('Max Temp_lag3', 0.001),
 ('Min Temp', 0.006),
 ('Min Temp_lag1', -0.015),
 ('Min Temp_lag2', 0.004),
 ('Min Temp_lag3', -0.007),
 ('Mean Temp', 0.012),
 ('Mean Temp_lag1', -0.003),
 ('Mean Temp_lag2', -0.014),
 ('Mean Temp_lag3', 0.015),
 ('Heat Deg Days', -0.001),
 ('Heat Deg Days_lag1', 0.013),
 ('Heat Deg Days_lag2', -0.005),
 ('Heat Deg Days_lag3', 0.001),
 ('Cool Deg Days', 0.015),
 ('Cool Deg Days_lag1', -0.021),
 ('Cool Deg Days_lag2', 0.005),
 ('Cool Deg Days_lag3', -0.008),
 ('Total Rain (mm)', 0.008),
 ('Total Rain (mm)_lag1', -0.008),
 ('Total Rain (mm)_lag2', 0.004),
 ('Total Rain (mm)_lag3', 0.003),
 ('Total Snow (cm)', 0.005),
 ('Total Snow (cm)_lag1', -0.003),
 ('Total Snow (cm)_lag2', -0.004),
 ('Total Snow (cm)_lag3', -0.009),
 ('Total Precip (mm)', 0.01),
 ('Total Precip (mm)_lag1', -0.011),
 ('Total Precip (mm)_lag2', 0.009),
 ('Total Precip (mm)_lag3', -0.0),
 ('Snow on Grnd (cm)', 0.001),
 ('Snow on Grnd (cm)_lag1', 0.001),
 ('Snow on Grnd (cm)_lag2', -0.005),
 ('Snow on Grnd (cm)_lag3', 0.01),
 ('Sale', 0.006),
 ('Sale_lag1', -0.002),
 ('Sale_lag2', -0.001),
 ('Sale_lag3', 0.002)]
In [397]:
# Turn the (feature, coefficient) pairs into a two-column dataframe
gamingaccessory_lr_coef_df = pd.DataFrame(gamingaccessory_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.rename(columns=col_rename)
# Drop row 0 (the intercept term)
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.iloc[1:,:]
# Drop features whose coefficient is zero after rounding to 3 decimals
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.loc[gamingaccessory_lr_coef_df['Coefficients']!=0.0]
# Sort so that the largest positive coefficients come first
gamingaccessory_lr_coef_df = gamingaccessory_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
gamingaccessory_lr_coef_df
Out[397]:
Features Coefficients
36 product_vertical_gamepad 0.088
64 product_vertical_gamingmouse 0.085
28 is_mass_market 0.082
56 product_vertical_gamingkeyboard 0.074
24 is_cod 0.072
40 product_vertical_gamingaccessorykit 0.070
84 product_vertical_tvoutcableaccessory 0.068
52 product_vertical_gamingheadset 0.067
68 product_vertical_gamingmousepad 0.062
44 product_vertical_gamingadapter 0.060
76 product_vertical_joystickgamingwheel 0.058
80 product_vertical_motioncontroller 0.052
17 sla_lag1 0.043
60 product_vertical_gamingmemorycard 0.030
319 Stock Index_SMA_5_lag3 0.029
307 NPS_SMA_5_lag3 0.029
16 sla 0.024
279 Other_lag3 0.021
15 deliverycdays_lag3 0.019
82 product_vertical_motioncontroller_lag2 0.018
139 Digital_lag3 0.017
11 deliverybdays_lag3 0.016
336 Cool Deg Days 0.015
331 Mean Temp_lag3 0.015
23 product_procurement_sla_lag3 0.014
158 Sponsorship_lag2 0.014
126 TV_SMA_5_lag2 0.014
107 Total Investment_SMA_5_lag3 0.014
333 Heat Deg Days_lag1 0.013
71 product_vertical_gamingmousepad_lag3 0.013
118 TV_lag2 0.013
328 Mean Temp 0.012
51 product_vertical_gamingchargingstation_lag3 0.012
198 Online marketing_lag2 0.012
116 TV 0.012
218 Affiliates_lag2 0.012
267 Radio_SMA_5_lag3 0.012
177 Content Marketing_lag1 0.012
281 Other_SMA_3_lag1 0.011
216 Affiliates 0.011
98 Total Investment_lag2 0.011
258 Radio_lag2 0.011
196 Online marketing 0.011
62 product_vertical_gamingmemorycard_lag2 0.011
85 product_vertical_tvoutcableaccessory_lag1 0.010
220 Affiliates_SMA_3 0.010
355 Snow on Grnd (cm)_lag3 0.010
348 Total Precip (mm) 0.010
200 Online marketing_SMA_3 0.010
120 TV_SMA_3 0.010
350 Total Precip (mm)_lag2 0.009
226 Affiliates_SMA_5_lag2 0.009
20 product_procurement_sla 0.009
160 Sponsorship_SMA_3 0.009
340 Total Rain (mm) 0.008
69 product_vertical_gamingmousepad_lag1 0.008
45 product_vertical_gamingadapter_lag1 0.008
22 product_procurement_sla_lag2 0.008
240 SEM_SMA_3 0.008
54 product_vertical_gamingheadset_lag2 0.008
237 SEM_lag1 0.008
41 product_vertical_gamingaccessorykit_lag1 0.008
42 product_vertical_gamingaccessorykit_lag2 0.008
86 product_vertical_tvoutcableaccessory_lag2 0.007
185 Content Marketing_SMA_5_lag1 0.007
100 Total Investment_SMA_3 0.007
140 Digital_SMA_3 0.007
238 SEM_lag2 0.007
37 product_vertical_gamepad_lag1 0.007
48 product_vertical_gamingchargingstation 0.007
356 Sale 0.006
181 Content Marketing_SMA_3_lag1 0.006
25 is_cod_lag1 0.006
167 Sponsorship_SMA_5_lag3 0.006
193 Content_Marketing_Ad_Stock_lag1 0.006
263 Radio_SMA_3_lag3 0.006
265 Radio_SMA_5_lag1 0.006
75 product_vertical_gamingspeaker_lag3 0.006
206 Online marketing_SMA_5_lag2 0.006
151 Digital_EMA_8_lag3 0.006
26 is_cod_lag2 0.006
324 Min Temp 0.006
143 Digital_SMA_3_lag3 0.005
251 SEM_EMA_8_lag3 0.005
338 Cool Deg Days_lag2 0.005
111 Total Investment_EMA_8_lag3 0.005
344 Total Snow (cm) 0.005
189 Content Marketing_EMA_8_lag1 0.005
46 product_vertical_gamingadapter_lag2 0.005
63 product_vertical_gamingmemorycard_lag3 0.005
155 Digital_Ad_Stock_lag3 0.005
239 SEM_lag3 0.005
291 Other_EMA_8_lag3 0.005
99 Total Investment_lag3 0.005
29 is_mass_market_lag1 0.004
299 NPS_lag3 0.004
137 Digital_lag1 0.004
171 Sponsorship_EMA_8_lag3 0.004
301 NPS_SMA_3_lag1 0.004
342 Total Rain (mm)_lag2 0.004
295 Other_Ad_Stock_lag3 0.004
308 Stock Index 0.004
247 SEM_SMA_5_lag3 0.004
311 Stock Index_lag3 0.004
313 Stock Index_SMA_3_lag1 0.004
66 product_vertical_gamingmouse_lag2 0.004
326 Min Temp_lag2 0.004
157 Sponsorship_lag1 0.004
261 Radio_SMA_3_lag1 0.004
343 Total Rain (mm)_lag3 0.003
141 Digital_SMA_3_lag1 0.003
138 Digital_lag2 0.003
217 Affiliates_lag1 0.003
152 Digital_Ad_Stock 0.003
156 Sponsorship 0.003
210 Online marketing_EMA_8_lag2 0.003
131 TV_EMA_8_lag3 0.003
130 TV_EMA_8_lag2 0.003
320 Max Temp 0.003
211 Online marketing_EMA_8_lag3 0.003
38 product_vertical_gamepad_lag2 0.003
243 SEM_SMA_3_lag3 0.003
110 Total Investment_EMA_8_lag2 0.003
58 product_vertical_gamingkeyboard_lag2 0.003
197 Online marketing_lag1 0.003
53 product_vertical_gamingheadset_lag1 0.003
159 Sponsorship_lag3 0.002
232 Affiliates_Ad_Stock 0.002
230 Affiliates_EMA_8_lag2 0.002
228 Affiliates_EMA_8 0.002
208 Online marketing_EMA_8 0.002
170 Sponsorship_EMA_8_lag2 0.002
241 SEM_SMA_3_lag1 0.002
256 Radio 0.002
179 Content Marketing_lag3 0.002
359 Sale_lag3 0.002
268 Radio_EMA_8 0.002
59 product_vertical_gamingkeyboard_lag3 0.002
115 Total_Investment_Ad_Stock_lag3 0.002
103 Total Investment_SMA_3_lag3 0.002
305 NPS_SMA_5_lag1 0.002
315 Stock Index_SMA_3_lag3 0.002
303 NPS_SMA_3_lag3 0.002
317 Stock Index_SMA_5_lag1 0.002
123 TV_SMA_3_lag3 0.002
114 Total_Investment_Ad_Stock_lag2 0.002
135 TV_Ad_Stock_lag3 0.002
144 Digital_SMA_5 0.002
30 is_mass_market_lag2 0.002
231 Affiliates_EMA_8_lag3 0.001
97 Total Investment_lag1 0.001
7 Discount%_lag3 0.001
108 Total Investment_EMA_8 0.001
312 Stock Index_SMA_3 0.001
353 Snow on Grnd (cm)_lag1 0.001
101 Total Investment_SMA_3_lag1 0.001
212 Online_marketing_Ad_Stock 0.001
352 Snow on Grnd (cm) 0.001
57 product_vertical_gamingkeyboard_lag1 0.001
335 Heat Deg Days_lag3 0.001
323 Max Temp_lag3 0.001
112 Total_Investment_Ad_Stock 0.001
203 Online marketing_SMA_3_lag3 0.001
310 Stock Index_lag2 0.001
148 Digital_EMA_8 0.001
272 Radio_Ad_Stock 0.001
252 SEM_Ad_Stock 0.001
134 TV_Ad_Stock_lag2 0.001
290 Other_EMA_8_lag2 0.001
128 TV_EMA_8 0.001
132 TV_Ad_Stock -0.001
14 deliverycdays_lag2 -0.001
27 is_cod_lag3 -0.001
273 Radio_Ad_Stock_lag1 -0.001
163 Sponsorship_SMA_3_lag3 -0.001
50 product_vertical_gamingchargingstation_lag2 -0.001
166 Sponsorship_SMA_5_lag2 -0.001
249 SEM_EMA_8_lag1 -0.001
332 Heat Deg Days -0.001
269 Radio_EMA_8_lag1 -0.001
322 Max Temp_lag2 -0.001
358 Sale_lag2 -0.001
117 TV_lag1 -0.001
195 Content_Marketing_Ad_Stock_lag3 -0.001
223 Affiliates_SMA_3_lag3 -0.001
250 SEM_EMA_8_lag2 -0.001
192 Content_Marketing_Ad_Stock -0.001
3 gmv_lag3 -0.001
153 Digital_Ad_Stock_lag1 -0.001
357 Sale_lag1 -0.002
245 SEM_SMA_5_lag1 -0.002
233 Affiliates_Ad_Stock_lag1 -0.002
271 Radio_EMA_8_lag3 -0.002
235 Affiliates_Ad_Stock_lag3 -0.002
254 SEM_Ad_Stock_lag2 -0.002
224 Affiliates_SMA_5 -0.002
2 gmv_lag2 -0.002
154 Digital_Ad_Stock_lag2 -0.002
194 Content_Marketing_Ad_Stock_lag2 -0.002
6 Discount%_lag2 -0.002
9 deliverybdays_lag1 -0.002
105 Total Investment_SMA_5_lag1 -0.002
150 Digital_EMA_8_lag2 -0.002
149 Digital_EMA_8_lag1 -0.002
113 Total_Investment_Ad_Stock_lag1 -0.002
109 Total Investment_EMA_8_lag1 -0.002
119 TV_lag3 -0.002
73 product_vertical_gamingspeaker_lag1 -0.002
67 product_vertical_gamingmouse_lag3 -0.002
184 Content Marketing_SMA_5 -0.003
286 Other_SMA_5_lag2 -0.003
182 Content Marketing_SMA_3_lag2 -0.003
169 Sponsorship_EMA_8_lag1 -0.003
274 Radio_Ad_Stock_lag2 -0.003
264 Radio_SMA_5 -0.003
246 SEM_SMA_5_lag2 -0.003
183 Content Marketing_SMA_3_lag3 -0.003
161 Sponsorship_SMA_3_lag1 -0.003
83 product_vertical_motioncontroller_lag3 -0.003
213 Online_marketing_Ad_Stock_lag1 -0.003
13 deliverycdays_lag1 -0.003
345 Total Snow (cm)_lag1 -0.003
106 Total Investment_SMA_5_lag2 -0.003
65 product_vertical_gamingmouse_lag1 -0.003
329 Mean Temp_lag1 -0.003
74 product_vertical_gamingspeaker_lag2 -0.003
47 product_vertical_gamingadapter_lag3 -0.004
142 Digital_SMA_3_lag2 -0.004
288 Other_EMA_8 -0.004
346 Total Snow (cm)_lag2 -0.004
145 Digital_SMA_5_lag1 -0.004
270 Radio_EMA_8_lag2 -0.004
12 deliverycdays -0.004
278 Other_lag2 -0.004
222 Affiliates_SMA_3_lag2 -0.004
207 Online marketing_SMA_5_lag3 -0.004
199 Online marketing_lag3 -0.004
262 Radio_SMA_3_lag2 -0.004
188 Content Marketing_EMA_8 -0.004
180 Content Marketing_SMA_3 -0.004
186 Content Marketing_SMA_5_lag2 -0.004
219 Affiliates_lag3 -0.004
19 sla_lag3 -0.005
133 TV_Ad_Stock_lag1 -0.005
70 product_vertical_gamingmousepad_lag2 -0.005
282 Other_SMA_3_lag2 -0.005
77 product_vertical_joystickgamingwheel_lag1 -0.005
173 Sponsorship_Ad_Stock_lag1 -0.005
334 Heat Deg Days_lag2 -0.005
354 Snow on Grnd (cm)_lag2 -0.005
8 deliverybdays -0.005
147 Digital_SMA_5_lag3 -0.005
309 Stock Index_lag1 -0.006
244 SEM_SMA_5 -0.006
190 Content Marketing_EMA_8_lag2 -0.006
55 product_vertical_gamingheadset_lag3 -0.006
31 is_mass_market_lag3 -0.006
287 Other_SMA_5_lag3 -0.006
5 Discount%_lag1 -0.006
292 Other_Ad_Stock -0.006
21 product_procurement_sla_lag1 -0.006
327 Min Temp_lag3 -0.007
191 Content Marketing_EMA_8_lag3 -0.007
276 Other -0.007
284 Other_SMA_5 -0.007
266 Radio_SMA_5_lag2 -0.007
297 NPS_lag1 -0.007
178 Content Marketing_lag2 -0.007
79 product_vertical_joystickgamingwheel_lag3 -0.008
201 Online marketing_SMA_3_lag1 -0.008
221 Affiliates_SMA_3_lag1 -0.008
341 Total Rain (mm)_lag1 -0.008
49 product_vertical_gamingchargingstation_lag1 -0.008
339 Cool Deg Days_lag3 -0.008
296 NPS -0.008
136 Digital -0.008
87 product_vertical_tvoutcableaccessory_lag3 -0.009
347 Total Snow (cm)_lag3 -0.009
205 Online marketing_SMA_5_lag1 -0.009
204 Online marketing_SMA_5 -0.009
165 Sponsorship_SMA_5_lag1 -0.009
225 Affiliates_SMA_5_lag1 -0.009
43 product_vertical_gamingaccessorykit_lag3 -0.010
39 product_vertical_gamepad_lag3 -0.010
318 Stock Index_SMA_5_lag2 -0.010
277 Other_lag1 -0.010
202 Online marketing_SMA_3_lag2 -0.010
306 NPS_SMA_5_lag2 -0.010
259 Radio_lag3 -0.010
122 TV_SMA_3_lag2 -0.011
242 SEM_SMA_3_lag2 -0.011
72 product_vertical_gamingspeaker -0.011
349 Total Precip (mm)_lag1 -0.011
187 Content Marketing_SMA_5_lag3 -0.011
125 TV_SMA_5_lag1 -0.012
257 Radio_lag1 -0.012
61 product_vertical_gamingmemorycard_lag1 -0.012
236 SEM -0.012
321 Max Temp_lag1 -0.012
127 TV_SMA_5_lag3 -0.013
121 TV_SMA_3_lag1 -0.013
104 Total Investment_SMA_5 -0.013
102 Total Investment_SMA_3_lag2 -0.013
260 Radio_SMA_3 -0.013
124 TV_SMA_5 -0.014
330 Mean Temp_lag2 -0.014
227 Affiliates_SMA_5_lag3 -0.014
325 Min Temp_lag1 -0.015
164 Sponsorship_SMA_5 -0.018
162 Sponsorship_SMA_3_lag2 -0.019
316 Stock Index_SMA_5 -0.020
302 NPS_SMA_3_lag2 -0.020
304 NPS_SMA_5 -0.020
314 Stock Index_SMA_3_lag2 -0.020
176 Content Marketing -0.021
337 Cool Deg Days_lag1 -0.021
18 sla_lag2 -0.053
4 Discount% -0.059

Plotting the Features in descending order of Importance for gamingaccessory

In [398]:
# Use a tall figure so that every feature label gets its own row.
plt.figure(figsize=(10, 35), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=gamingaccessory_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 features with the largest coefficients in the GMV (revenue) model for gamingaccessory are:
Features Coefficients
product_vertical_gamepad 0.088
product_vertical_gamingmouse 0.085
is_mass_market 0.082
product_vertical_gamingkeyboard 0.074
is_cod 0.072
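
Sorting by signed coefficient pushes large negative effects to the bottom of the table, so it is worth checking the ranking by absolute value as well. The sketch below does this on a handful of (feature, coefficient) pairs copied from the output above; the helper name `top_by_magnitude` is hypothetical and not part of the notebook.

```python
# A few (feature, coefficient) pairs copied from the gamingaccessory output above.
coef_excerpt = [
    ("product_vertical_gamepad", 0.088),
    ("product_vertical_gamingmouse", 0.085),
    ("is_mass_market", 0.082),
    ("product_vertical_gamingkeyboard", 0.074),
    ("is_cod", 0.072),
    ("product_vertical_gamingaccessorykit", 0.070),
    ("sla_lag1", 0.043),
    ("sla_lag2", -0.053),
    ("Discount%", -0.059),
]


def top_by_magnitude(coefs, n=5):
    """Hypothetical helper: return the n pairs with the largest |coefficient|."""
    return sorted(coefs, key=lambda pair: abs(pair[1]), reverse=True)[:n]


top5 = top_by_magnitude(coef_excerpt)
for name, value in top5:
    print(f"{name:40s} {value:+.3f}")
```

For gamingaccessory the top five by magnitude are the same five features listed above, because the most negative coefficient in the table (Discount%, -0.059) is smaller in magnitude than the fifth-largest positive one (is_cod, 0.072).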

Building Linear Regression model for homeaudio

In [399]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

homeaudio_dlmul_model = LinearRegression().fit(X_homeaudio_dlmul_train, y_homeaudio_dlmul_train)
y_homeaudio_dlmul_test_pred = homeaudio_dlmul_model.predict(X_homeaudio_dlmul_test)

print('R2 Score: {}'.format(r2_score(y_homeaudio_dlmul_test, y_homeaudio_dlmul_test_pred)))
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_dlmul_test, y_homeaudio_dlmul_test_pred)))
R2 Score: -0.22991361911881447
Mean Squared Error: 0.25736843256487285
With simple linear regression on the train/test split, we get an R2 score of -0.23 and an MSE of 0.26.

Here R2 is negative, which means the model's squared error on the test set is larger than the error of always predicting the mean of the test target, i.e. it fits worse than a horizontal line. In other words, the model fitted on the training split generalises very poorly to the held-out data.
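
To make the sign concrete: R2 is computed as 1 - SS_res / SS_tot, where SS_tot is the squared error of always predicting the mean of the target. The sketch below uses small made-up numbers (not the project data) and a hypothetical `r2` helper that follows this definition.

```python
def r2(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up illustrative values, not taken from the ElecKart data.
y_true = [1.0, 2.0, 3.0, 4.0]
mean_baseline = [2.5] * 4          # always predict the mean of y_true
bad_model = [4.0, 3.0, 2.0, 1.0]   # predictions that move against the data

print(r2(y_true, mean_baseline))   # 0.0: exactly as good as the mean
print(r2(y_true, bad_model))       # -3.0: worse than the mean
```

Any model whose residual sum of squares exceeds SS_tot lands below zero, so there is no lower bound on R2 for held-out data.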

Building Linear Regression model for homeaudio using K-fold Cross Validation

We will use 5-fold cross-validation (via cross_val_predict) for our linear regression.
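
As a rough illustration of what cross-validated prediction does, the sketch below splits the rows into contiguous, unshuffled folds, fits a model on the other folds, and predicts only the held-out rows. It is a simplified stand-in, not the scikit-learn implementation: it uses a single predictor, made-up data, and hypothetical helper names (`kfold_slices`, `fit_line`, `out_of_fold_predictions`).

```python
def kfold_slices(n_samples, n_splits):
    """Contiguous, unshuffled folds; the first n_samples % n_splits folds get one extra row."""
    start = 0
    for i in range(n_splits):
        size = n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        yield range(start, start + size)
        start += size


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def out_of_fold_predictions(x, y, n_splits=5):
    """Predict each row with a model trained on the folds that do not contain it."""
    preds = [None] * len(x)
    for test_idx in kfold_slices(len(x), n_splits):
        held_out = set(test_idx)
        train = [i for i in range(len(x)) if i not in held_out]
        b0, b1 = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            preds[i] = b0 + b1 * x[i]
    return preds


# Made-up, exactly linear data, so every fold recovers the same line.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
preds = out_of_fold_predictions(x, y)
print(preds)
```

Scoring these out-of-fold predictions against the full target gives one R2 for the whole dataset, which is how the score in the next cells is computed.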

In [400]:
y_homeaudio_dlmul = homeaudio_dlmul_df.pop('gmv')
X_homeaudio_dlmul = homeaudio_dlmul_df
In [401]:
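# Make cross-validated predictions
from sklearn.model_selection import cross_val_score,cross_val_predict
from sklearn import metrics

# Fit on the full data; these coefficients are reused for feature importance below.
homeaudio_model_dlmul_cv = LinearRegression().fit(X_homeaudio_dlmul, y_homeaudio_dlmul)
# cross_val_predict refits a clone of the estimator on each of the 5 folds,
# so every prediction comes from a model that did not see that row.
homeaudio_dlmul_predictions_cv = cross_val_predict(homeaudio_model_dlmul_cv, X_homeaudio_dlmul, y_homeaudio_dlmul, cv=5)
# The "accuracy" printed here is the R2 score of the out-of-fold predictions.
accuracy = metrics.r2_score(y_homeaudio_dlmul, homeaudio_dlmul_predictions_cv)
print("Cross-Predicted Accuracy:", accuracy)
print('Mean Squared Error: {}'.format(mean_squared_error(y_homeaudio_dlmul, homeaudio_dlmul_predictions_cv)))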
Cross-Predicted Accuracy: 0.5730040252945725
Mean Squared Error: 0.42699597470542755
With linear regression and 5-fold cross-validation, we get an R2 score of 0.57 and an MSE of 0.43.
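
The R2 and MSE reported above add up to 1 (0.573 + 0.427). This is what the identity MSE = (1 - R2) * Var(y) gives when the target has unit variance, which suggests that gmv was standardized earlier in the notebook; that scaling step is not shown in this section, so treat it as an assumption. The sketch below checks the identity on made-up numbers with hypothetical helper functions.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def population_variance(y):
    """Variance with a 1/n divisor (not the 1/(n-1) sample variance)."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y) / len(y)


def r2(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot, written here as 1 - MSE / Var(y)."""
    return 1 - mse(y_true, y_pred) / population_variance(y_true)


# Made-up illustrative values, not taken from the ElecKart data.
y_true = [-1.5, -0.5, 0.5, 1.5]
y_pred = [-1.0, -1.0, 1.0, 1.0]

lhs = mse(y_true, y_pred)
rhs = (1 - r2(y_true, y_pred)) * population_variance(y_true)
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

One practical consequence: when the target is on a standardized scale, the MSE carries no information beyond R2, and neither number is in revenue units.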

Determining Feature Importance for homeaudio using the model from the cv step

In [402]:
# linear regression model parameters
# Limit float output to 3 decimal places
pd.set_option('display.float_format', lambda x: '{:.3f}'.format(x))
pd.set_option('display.precision',1)

# Intercept first, then one coefficient per feature, rounded to 3 decimals
homeaudio_lr_model_parameters = list(homeaudio_model_dlmul_cv.coef_)
homeaudio_lr_model_parameters.insert(0, homeaudio_model_dlmul_cv.intercept_)
homeaudio_lr_model_parameters = [round(x, 3) for x in homeaudio_lr_model_parameters]
# Take the column names from the data the model was fitted on, so names and coefficients line up
cols = X_homeaudio_dlmul.columns
cols = cols.insert(0, "constant")
homeaudio_lr_coef = list(zip(cols, homeaudio_lr_model_parameters))
homeaudio_lr_coef
Out[402]:
[('constant', -0.0),
 ('gmv_lag1', -0.005),
 ('gmv_lag2', 0.001),
 ('gmv_lag3', -0.002),
 ('Discount%', 0.101),
 ('Discount%_lag1', 0.001),
 ('Discount%_lag2', -0.001),
 ('Discount%_lag3', 0.003),
 ('deliverybdays', -0.002),
 ('deliverybdays_lag1', 0.016),
 ('deliverybdays_lag2', -0.006),
 ('deliverybdays_lag3', -0.007),
 ('deliverycdays', -0.002),
 ('deliverycdays_lag1', 0.017),
 ('deliverycdays_lag2', -0.005),
 ('deliverycdays_lag3', -0.009),
 ('sla', -0.05),
 ('sla_lag1', 0.008),
 ('sla_lag2', 0.002),
 ('sla_lag3', -0.005),
 ('product_procurement_sla', 0.039),
 ('product_procurement_sla_lag1', 0.014),
 ('product_procurement_sla_lag2', -0.012),
 ('product_procurement_sla_lag3', 0.012),
 ('is_cod', 0.112),
 ('is_cod_lag1', -0.011),
 ('is_cod_lag2', 0.006),
 ('is_cod_lag3', -0.003),
 ('is_mass_market', 0.133),
 ('is_mass_market_lag1', -0.015),
 ('is_mass_market_lag2', 0.002),
 ('is_mass_market_lag3', -0.004),
 ('product_vertical_djcontroller', 0.036),
 ('product_vertical_djcontroller_lag1', 0.001),
 ('product_vertical_djcontroller_lag2', -0.006),
 ('product_vertical_djcontroller_lag3', -0.008),
 ('product_vertical_dock', 0.06),
 ('product_vertical_dock_lag1', -0.011),
 ('product_vertical_dock_lag2', 0.005),
 ('product_vertical_dock_lag3', -0.005),
 ('product_vertical_dockingstation', -0.004),
 ('product_vertical_dockingstation_lag1', 0.038),
 ('product_vertical_dockingstation_lag2', -0.016),
 ('product_vertical_dockingstation_lag3', 0.011),
 ('product_vertical_fmradio', 0.122),
 ('product_vertical_fmradio_lag1', -0.011),
 ('product_vertical_fmradio_lag2', 0.001),
 ('product_vertical_fmradio_lag3', -0.005),
 ('product_vertical_hifisystem', 0.095),
 ('product_vertical_hifisystem_lag1', -0.023),
 ('product_vertical_hifisystem_lag2', 0.007),
 ('product_vertical_hifisystem_lag3', -0.02),
 ('product_vertical_homeaudiospeaker', 0.136),
 ('product_vertical_homeaudiospeaker_lag1', -0.018),
 ('product_vertical_homeaudiospeaker_lag2', 0.004),
 ('product_vertical_homeaudiospeaker_lag3', -0.004),
 ('product_vertical_karaokeplayer', 0.0),
 ('product_vertical_karaokeplayer_lag1', 0.0),
 ('product_vertical_karaokeplayer_lag2', 0.0),
 ('product_vertical_karaokeplayer_lag3', 0.0),
 ('product_vertical_slingbox', -0.003),
 ('product_vertical_slingbox_lag1', -0.0),
 ('product_vertical_slingbox_lag2', 0.001),
 ('product_vertical_slingbox_lag3', -0.017),
 ('product_vertical_soundmixer', 0.003),
 ('product_vertical_soundmixer_lag1', 0.011),
 ('product_vertical_soundmixer_lag2', 0.021),
 ('product_vertical_soundmixer_lag3', -0.008),
 ('product_vertical_voicerecorder', 0.104),
 ('product_vertical_voicerecorder_lag1', -0.004),
 ('product_vertical_voicerecorder_lag2', 0.007),
 ('product_vertical_voicerecorder_lag3', -0.0),
 ('payday_week', 0.0),
 ('payday_week_lag1', 0.0),
 ('payday_week_lag2', 0.0),
 ('payday_week_lag3', 0.0),
 ('holiday_week', 0.0),
 ('holiday_week_lag1', 0.0),
 ('holiday_week_lag2', 0.0),
 ('holiday_week_lag3', 0.0),
 ('Total Investment', 0.02),
 ('Total Investment_lag1', -0.003),
 ('Total Investment_lag2', 0.004),
 ('Total Investment_lag3', 0.002),
 ('Total Investment_SMA_3', -0.001),
 ('Total Investment_SMA_3_lag1', 0.003),
 ('Total Investment_SMA_3_lag2', -0.016),
 ('Total Investment_SMA_3_lag3', 0.014),
 ('Total Investment_SMA_5', -0.021),
 ('Total Investment_SMA_5_lag1', 0.013),
 ('Total Investment_SMA_5_lag2', 0.006),
 ('Total Investment_SMA_5_lag3', 0.006),
 ('Total Investment_EMA_8', 0.001),
 ('Total Investment_EMA_8_lag1', 0.008),
 ('Total Investment_EMA_8_lag2', 0.002),
 ('Total Investment_EMA_8_lag3', 0.004),
 ('Total_Investment_Ad_Stock', 0.002),
 ('Total_Investment_Ad_Stock_lag1', 0.004),
 ('Total_Investment_Ad_Stock_lag2', 0.0),
 ('Total_Investment_Ad_Stock_lag3', -0.0),
 ('TV', 0.028),
 ('TV_lag1', -0.012),
 ('TV_lag2', -0.003),
 ('TV_lag3', -0.003),
 ('TV_SMA_3', -0.011),
 ('TV_SMA_3_lag1', 0.002),
 ('TV_SMA_3_lag2', 0.021),
 ('TV_SMA_3_lag3', -0.013),
 ('TV_SMA_5', 0.016),
 ('TV_SMA_5_lag1', -0.012),
 ('TV_SMA_5_lag2', -0.009),
 ('TV_SMA_5_lag3', -0.006),
 ('TV_EMA_8', -0.001),
 ('TV_EMA_8_lag1', -0.001),
 ('TV_EMA_8_lag2', -0.0),
 ('TV_EMA_8_lag3', 0.002),
 ('TV_Ad_Stock', -0.003),
 ('TV_Ad_Stock_lag1', -0.002),
 ('TV_Ad_Stock_lag2', -0.005),
 ('TV_Ad_Stock_lag3', -0.004),
 ('Digital', 0.003),
 ('Digital_lag1', 0.011),
 ('Digital_lag2', 0.001),
 ('Digital_lag3', -0.007),
 ('Digital_SMA_3', -0.002),
 ('Digital_SMA_3_lag1', 0.01),
 ('Digital_SMA_3_lag2', 0.002),
 ('Digital_SMA_3_lag3', -0.008),
 ('Digital_SMA_5', 0.0),
 ('Digital_SMA_5_lag1', 0.002),
 ('Digital_SMA_5_lag2', -0.002),
 ('Digital_SMA_5_lag3', -0.016),
 ('Digital_EMA_8', -0.008),
 ('Digital_EMA_8_lag1', 0.01),
 ('Digital_EMA_8_lag2', -0.01),
 ('Digital_EMA_8_lag3', -0.008),
 ('Digital_Ad_Stock', -0.007),
 ('Digital_Ad_Stock_lag1', 0.014),
 ('Digital_Ad_Stock_lag2', -0.01),
 ('Digital_Ad_Stock_lag3', -0.014),
 ('Sponsorship', 0.024),
 ('Sponsorship_lag1', -0.016),
 ('Sponsorship_lag2', 0.002),
 ('Sponsorship_lag3', 0.001),
 ('Sponsorship_SMA_3', -0.008),
 ('Sponsorship_SMA_3_lag1', -0.002),
 ('Sponsorship_SMA_3_lag2', -0.006),
 ('Sponsorship_SMA_3_lag3', 0.012),
 ('Sponsorship_SMA_5', -0.017),
 ('Sponsorship_SMA_5_lag1', 0.006),
 ('Sponsorship_SMA_5_lag2', 0.009),
 ('Sponsorship_SMA_5_lag3', 0.004),
 ('Sponsorship_EMA_8', -0.001),
 ('Sponsorship_EMA_8_lag1', 0.004),
 ('Sponsorship_EMA_8_lag2', 0.003),
 ('Sponsorship_EMA_8_lag3', 0.006),
 ('Sponsorship_Ad_Stock', -0.002),
 ('Sponsorship_Ad_Stock_lag1', -0.001),
 ('Sponsorship_Ad_Stock_lag2', 0.001),
 ('Sponsorship_Ad_Stock_lag3', 0.001),
 ('Content Marketing', -0.007),
 ('Content Marketing_lag1', 0.0),
 ('Content Marketing_lag2', 0.006),
 ('Content Marketing_lag3', 0.0),
 ('Content Marketing_SMA_3', -0.001),
 ('Content Marketing_SMA_3_lag1', 0.005),
 ('Content Marketing_SMA_3_lag2', 0.007),
 ('Content Marketing_SMA_3_lag3', -0.001),
 ('Content Marketing_SMA_5', 0.003),
 ('Content Marketing_SMA_5_lag1', 0.003),
 ('Content Marketing_SMA_5_lag2', 0.013),
 ('Content Marketing_SMA_5_lag3', -0.006),
 ('Content Marketing_EMA_8', -0.004),
 ('Content Marketing_EMA_8_lag1', 0.004),
 ('Content Marketing_EMA_8_lag2', -0.001),
 ('Content Marketing_EMA_8_lag3', -0.002),
 ('Content_Marketing_Ad_Stock', -0.003),
 ('Content_Marketing_Ad_Stock_lag1', 0.011),
 ('Content_Marketing_Ad_Stock_lag2', 0.004),
 ('Content_Marketing_Ad_Stock_lag3', -0.002),
 ('Online marketing', 0.025),
 ('Online marketing_lag1', -0.006),
 ('Online marketing_lag2', -0.002),
 ('Online marketing_lag3', 0.0),
 ('Online marketing_SMA_3', -0.005),
 ('Online marketing_SMA_3_lag1', 0.004),
 ('Online marketing_SMA_3_lag2', 0.011),
 ('Online marketing_SMA_3_lag3', -0.001),
 ('Online marketing_SMA_5', 0.007),
 ('Online marketing_SMA_5_lag1', 0.001),
 ('Online marketing_SMA_5_lag2', -0.0),
 ('Online marketing_SMA_5_lag3', 0.003),
 ('Online marketing_EMA_8', 0.002),
 ('Online marketing_EMA_8_lag1', 0.004),
 ('Online marketing_EMA_8_lag2', 0.002),
 ('Online marketing_EMA_8_lag3', 0.004),
 ('Online_marketing_Ad_Stock', 0.001),
 ('Online_marketing_Ad_Stock_lag1', 0.002),
 ('Online_marketing_Ad_Stock_lag2', -0.001),
 ('Online_marketing_Ad_Stock_lag3', -0.001),
 ('Affiliates', 0.024),
 ('Affiliates_lag1', -0.007),
 ('Affiliates_lag2', -0.002),
 ('Affiliates_lag3', -0.0),
 ('Affiliates_SMA_3', -0.004),
 ('Affiliates_SMA_3_lag1', 0.003),
 ('Affiliates_SMA_3_lag2', 0.024),
 ('Affiliates_SMA_3_lag3', -0.008),
 ('Affiliates_SMA_5', 0.023),
 ('Affiliates_SMA_5_lag1', -0.008),
 ('Affiliates_SMA_5_lag2', -0.002),
 ('Affiliates_SMA_5_lag3', 0.001),
 ('Affiliates_EMA_8', 0.002),
 ('Affiliates_EMA_8_lag1', 0.002),
 ('Affiliates_EMA_8_lag2', 0.002),
 ('Affiliates_EMA_8_lag3', 0.003),
 ('Affiliates_Ad_Stock', 0.001),
 ('Affiliates_Ad_Stock_lag1', -0.0),
 ('Affiliates_Ad_Stock_lag2', -0.001),
 ('Affiliates_Ad_Stock_lag3', -0.002),
 ('SEM', 0.007),
 ('SEM_lag1', 0.012),
 ('SEM_lag2', 0.005),
 ('SEM_lag3', -0.001),
 ('SEM_SMA_3', 0.0),
 ('SEM_SMA_3_lag1', 0.011),
 ('SEM_SMA_3_lag2', -0.003),
 ('SEM_SMA_3_lag3', 0.009),
 ('SEM_SMA_5', -0.008),
 ('SEM_SMA_5_lag1', 0.014),
 ('SEM_SMA_5_lag2', 0.009),
 ('SEM_SMA_5_lag3', -0.0),
 ('SEM_EMA_8', -0.002),
 ('SEM_EMA_8_lag1', 0.015),
 ('SEM_EMA_8_lag2', -0.002),
 ('SEM_EMA_8_lag3', 0.003),
 ('SEM_Ad_Stock', -0.002),
 ('SEM_Ad_Stock_lag1', 0.014),
 ('SEM_Ad_Stock_lag2', -0.003),
 ('SEM_Ad_Stock_lag3', -0.004),
 ('Radio', 0.01),
 ('Radio_lag1', 0.004),
 ('Radio_lag2', 0.013),
 ('Radio_lag3', -0.012),
 ('Radio_SMA_3', -0.01),
 ('Radio_SMA_3_lag1', 0.01),
 ('Radio_SMA_3_lag2', 0.001),
 ('Radio_SMA_3_lag3', 0.004),
 ('Radio_SMA_5', 0.012),
 ('Radio_SMA_5_lag1', 0.009),
 ('Radio_SMA_5_lag2', -0.014),
 ('Radio_SMA_5_lag3', 0.036),
 ('Radio_EMA_8', -0.004),
 ('Radio_EMA_8_lag1', -0.007),
 ('Radio_EMA_8_lag2', -0.007),
 ('Radio_EMA_8_lag3', -0.002),
 ('Radio_Ad_Stock', -0.002),
 ('Radio_Ad_Stock_lag1', 0.0),
 ('Radio_Ad_Stock_lag2', -0.003),
 ('Radio_Ad_Stock_lag3', 0.004),
 ('Other', -0.004),
 ('Other_lag1', 0.006),
 ('Other_lag2', 0.005),
 ('Other_lag3', 0.003),
 ('Other_SMA_3', 0.004),
 ('Other_SMA_3_lag1', -0.013),
 ('Other_SMA_3_lag2', 0.001),
 ('Other_SMA_3_lag3', 0.005),
 ('Other_SMA_5', -0.003),
 ('Other_SMA_5_lag1', -0.004),
 ('Other_SMA_5_lag2', -0.0),
 ('Other_SMA_5_lag3', 0.003),
 ('Other_EMA_8', -0.005),
 ('Other_EMA_8_lag1', -0.002),
 ('Other_EMA_8_lag2', -0.004),
 ('Other_EMA_8_lag3', -0.001),
 ('Other_Ad_Stock', -0.004),
 ('Other_Ad_Stock_lag1', 0.004),
 ('Other_Ad_Stock_lag2', 0.001),
 ('Other_Ad_Stock_lag3', 0.006),
 ('NPS', -0.015),
 ('NPS_lag1', 0.009),
 ('NPS_lag2', -0.001),
 ('NPS_lag3', 0.001),
 ('NPS_SMA_3', -0.001),
 ('NPS_SMA_3_lag1', 0.001),
 ('NPS_SMA_3_lag2', -0.044),
 ('NPS_SMA_3_lag3', 0.019),
 ('NPS_SMA_5', -0.044),
 ('NPS_SMA_5_lag1', 0.02),
 ('NPS_SMA_5_lag2', 0.003),
 ('NPS_SMA_5_lag3', 0.006),
 ('Stock Index', -0.009),
 ('Stock Index_lag1', 0.011),
 ('Stock Index_lag2', -0.002),
 ('Stock Index_lag3', 0.001),
 ('Stock Index_SMA_3', -0.001),
 ('Stock Index_SMA_3_lag1', 0.002),
 ('Stock Index_SMA_3_lag2', -0.043),
 ('Stock Index_SMA_3_lag3', 0.019),
 ('Stock Index_SMA_5', -0.043),
 ('Stock Index_SMA_5_lag1', 0.02),
 ('Stock Index_SMA_5_lag2', 0.004),
 ('Stock Index_SMA_5_lag3', 0.006),
 ('Max Temp', -0.0),
 ('Max Temp_lag1', 0.003),
 ('Max Temp_lag2', 0.001),
 ('Max Temp_lag3', -0.001),
 ('Min Temp', -0.006),
 ('Min Temp_lag1', 0.011),
 ('Min Temp_lag2', -0.005),
 ('Min Temp_lag3', 0.013),
 ('Mean Temp', 0.019),
 ('Mean Temp_lag1', -0.011),
 ('Mean Temp_lag2', -0.008),
 ('Mean Temp_lag3', 0.003),
 ('Heat Deg Days', 0.009),
 ('Heat Deg Days_lag1', -0.014),
 ('Heat Deg Days_lag2', 0.004),
 ('Heat Deg Days_lag3', -0.025),
 ('Cool Deg Days', 0.002),
 ('Cool Deg Days_lag1', -0.018),
 ('Cool Deg Days_lag2', 0.01),
 ('Cool Deg Days_lag3', 0.005),
 ('Total Rain (mm)', 0.016),
 ('Total Rain (mm)_lag1', 0.003),
 ('Total Rain (mm)_lag2', 0.005),
 ('Total Rain (mm)_lag3', 0.004),
 ('Total Snow (cm)', -0.006),
 ('Total Snow (cm)_lag1', -0.005),
 ('Total Snow (cm)_lag2', 0.009),
 ('Total Snow (cm)_lag3', 0.006),
 ('Total Precip (mm)', 0.013),
 ('Total Precip (mm)_lag1', 0.001),
 ('Total Precip (mm)_lag2', 0.009),
 ('Total Precip (mm)_lag3', -0.002),
 ('Snow on Grnd (cm)', 0.004),
 ('Snow on Grnd (cm)_lag1', 0.01),
 ('Snow on Grnd (cm)_lag2', 0.004),
 ('Snow on Grnd (cm)_lag3', 0.007),
 ('Sale', 0.024),
 ('Sale_lag1', -0.007),
 ('Sale_lag2', 0.004),
 ('Sale_lag3', -0.018)]
In [403]:
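# Turn the (feature, coefficient) pairs into a two-column dataframe
homeaudio_lr_coef_df = pd.DataFrame(homeaudio_lr_coef)
col_rename = {0:'Features',1: 'Coefficients'}
homeaudio_lr_coef_df = homeaudio_lr_coef_df.rename(columns=col_rename)
# Drop row 0, which holds the intercept ('constant')
homeaudio_lr_coef_df = homeaudio_lr_coef_df.iloc[1:,:]
# Drop features whose coefficient is zero after rounding to 3 decimals
homeaudio_lr_coef_df = homeaudio_lr_coef_df.loc[homeaudio_lr_coef_df['Coefficients']!=0.0]
# Sort so that the largest positive coefficients come first
homeaudio_lr_coef_df = homeaudio_lr_coef_df.sort_values(by=['Coefficients'], ascending = False)
homeaudio_lr_coef_df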
Out[403]:
Features Coefficients
52 product_vertical_homeaudiospeaker 0.136
28 is_mass_market 0.133
44 product_vertical_fmradio 0.122
24 is_cod 0.112
68 product_vertical_voicerecorder 0.104
4 Discount% 0.101
48 product_vertical_hifisystem 0.095
36 product_vertical_dock 0.060
20 product_procurement_sla 0.039
41 product_vertical_dockingstation_lag1 0.038
32 product_vertical_djcontroller 0.036
251 Radio_SMA_5_lag3 0.036
100 TV 0.028
180 Online marketing 0.025
206 Affiliates_SMA_3_lag2 0.024
340 Sale 0.024
200 Affiliates 0.024
140 Sponsorship 0.024
208 Affiliates_SMA_5 0.023
106 TV_SMA_3_lag2 0.021
66 product_vertical_soundmixer_lag2 0.021
301 Stock Index_SMA_5_lag1 0.020
289 NPS_SMA_5_lag1 0.020
80 Total Investment 0.020
312 Mean Temp 0.019
287 NPS_SMA_3_lag3 0.019
299 Stock Index_SMA_3_lag3 0.019
13 deliverycdays_lag1 0.017
324 Total Rain (mm) 0.016
9 deliverybdays_lag1 0.016
108 TV_SMA_5 0.016
233 SEM_EMA_8_lag1 0.015
229 SEM_SMA_5_lag1 0.014
237 SEM_Ad_Stock_lag1 0.014
21 product_procurement_sla_lag1 0.014
137 Digital_Ad_Stock_lag1 0.014
87 Total Investment_SMA_3_lag3 0.014
311 Min Temp_lag3 0.013
332 Total Precip (mm) 0.013
89 Total Investment_SMA_5_lag1 0.013
170 Content Marketing_SMA_5_lag2 0.013
242 Radio_lag2 0.013
23 product_procurement_sla_lag3 0.012
248 Radio_SMA_5 0.012
147 Sponsorship_SMA_3_lag3 0.012
221 SEM_lag1 0.012
186 Online marketing_SMA_3_lag2 0.011
309 Min Temp_lag1 0.011
177 Content_Marketing_Ad_Stock_lag1 0.011
65 product_vertical_soundmixer_lag1 0.011
121 Digital_lag1 0.011
293 Stock Index_lag1 0.011
225 SEM_SMA_3_lag1 0.011
43 product_vertical_dockingstation_lag3 0.011
322 Cool Deg Days_lag2 0.010
133 Digital_EMA_8_lag1 0.010
337 Snow on Grnd (cm)_lag1 0.010
245 Radio_SMA_3_lag1 0.010
125 Digital_SMA_3_lag1 0.010
240 Radio 0.010
249 Radio_SMA_5_lag1 0.009
281 NPS_lag1 0.009
227 SEM_SMA_3_lag3 0.009
334 Total Precip (mm)_lag2 0.009
316 Heat Deg Days 0.009
330 Total Snow (cm)_lag2 0.009
230 SEM_SMA_5_lag2 0.009
150 Sponsorship_SMA_5_lag2 0.009
17 sla_lag1 0.008
93 Total Investment_EMA_8_lag1 0.008
166 Content Marketing_SMA_3_lag2 0.007
70 product_vertical_voicerecorder_lag2 0.007
188 Online marketing_SMA_5 0.007
220 SEM 0.007
50 product_vertical_hifisystem_lag2 0.007
339 Snow on Grnd (cm)_lag3 0.007
303 Stock Index_SMA_5_lag3 0.006
162 Content Marketing_lag2 0.006
26 is_cod_lag2 0.006
149 Sponsorship_SMA_5_lag1 0.006
155 Sponsorship_EMA_8_lag3 0.006
90 Total Investment_SMA_5_lag2 0.006
91 Total Investment_SMA_5_lag3 0.006
331 Total Snow (cm)_lag3 0.006
291 NPS_SMA_5_lag3 0.006
261 Other_lag1 0.006
279 Other_Ad_Stock_lag3 0.006
323 Cool Deg Days_lag3 0.005
262 Other_lag2 0.005
267 Other_SMA_3_lag3 0.005
222 SEM_lag2 0.005
326 Total Rain (mm)_lag2 0.005
165 Content Marketing_SMA_3_lag1 0.005
38 product_vertical_dock_lag2 0.005
247 Radio_SMA_3_lag3 0.004
151 Sponsorship_SMA_5_lag3 0.004
153 Sponsorship_EMA_8_lag1 0.004
173 Content Marketing_EMA_8_lag1 0.004
259 Radio_Ad_Stock_lag3 0.004
185 Online marketing_SMA_3_lag1 0.004
277 Other_Ad_Stock_lag1 0.004
193 Online marketing_EMA_8_lag1 0.004
264 Other_SMA_3 0.004
195 Online marketing_EMA_8_lag3 0.004
302 Stock Index_SMA_5_lag2 0.004
241 Radio_lag1 0.004
178 Content_Marketing_Ad_Stock_lag2 0.004
327 Total Rain (mm)_lag3 0.004
97 Total_Investment_Ad_Stock_lag1 0.004
54 product_vertical_homeaudiospeaker_lag2 0.004
342 Sale_lag2 0.004
338 Snow on Grnd (cm)_lag2 0.004
82 Total Investment_lag2 0.004
336 Snow on Grnd (cm) 0.004
95 Total Investment_EMA_8_lag3 0.004
318 Heat Deg Days_lag2 0.004
64 product_vertical_soundmixer 0.003
169 Content Marketing_SMA_5_lag1 0.003
315 Mean Temp_lag3 0.003
154 Sponsorship_EMA_8_lag2 0.003
85 Total Investment_SMA_3_lag1 0.003
191 Online marketing_SMA_5_lag3 0.003
7 Discount%_lag3 0.003
290 NPS_SMA_5_lag2 0.003
205 Affiliates_SMA_3_lag1 0.003
271 Other_SMA_5_lag3 0.003
325 Total Rain (mm)_lag1 0.003
235 SEM_EMA_8_lag3 0.003
120 Digital 0.003
215 Affiliates_EMA_8_lag3 0.003
305 Max Temp_lag1 0.003
168 Content Marketing_SMA_5 0.003
263 Other_lag3 0.003
212 Affiliates_EMA_8 0.002
83 Total Investment_lag3 0.002
320 Cool Deg Days 0.002
214 Affiliates_EMA_8_lag2 0.002
18 sla_lag2 0.002
192 Online marketing_EMA_8 0.002
30 is_mass_market_lag2 0.002
194 Online marketing_EMA_8_lag2 0.002
94 Total Investment_EMA_8_lag2 0.002
126 Digital_SMA_3_lag2 0.002
213 Affiliates_EMA_8_lag1 0.002
197 Online_marketing_Ad_Stock_lag1 0.002
96 Total_Investment_Ad_Stock 0.002
129 Digital_SMA_5_lag1 0.002
142 Sponsorship_lag2 0.002
297 Stock Index_SMA_3_lag1 0.002
115 TV_EMA_8_lag3 0.002
105 TV_SMA_3_lag1 0.002
5 Discount%_lag1 0.001
306 Max Temp_lag2 0.001
143 Sponsorship_lag3 0.001
333 Total Precip (mm)_lag1 0.001
62 product_vertical_slingbox_lag2 0.001
211 Affiliates_SMA_5_lag3 0.001
278 Other_Ad_Stock_lag2 0.001
295 Stock Index_lag3 0.001
266 Other_SMA_3_lag2 0.001
216 Affiliates_Ad_Stock 0.001
122 Digital_lag2 0.001
46 product_vertical_fmradio_lag2 0.001
189 Online marketing_SMA_5_lag1 0.001
196 Online_marketing_Ad_Stock 0.001
2 gmv_lag2 0.001
92 Total Investment_EMA_8 0.001
33 product_vertical_djcontroller_lag1 0.001
283 NPS_lag3 0.001
246 Radio_SMA_3_lag2 0.001
158 Sponsorship_Ad_Stock_lag2 0.001
159 Sponsorship_Ad_Stock_lag3 0.001
285 NPS_SMA_3_lag1 0.001
218 Affiliates_Ad_Stock_lag2 -0.001
284 NPS_SMA_3 -0.001
6 Discount%_lag2 -0.001
296 Stock Index_SMA_3 -0.001
174 Content Marketing_EMA_8_lag2 -0.001
223 SEM_lag3 -0.001
164 Content Marketing_SMA_3 -0.001
157 Sponsorship_Ad_Stock_lag1 -0.001
152 Sponsorship_EMA_8 -0.001
187 Online marketing_SMA_3_lag3 -0.001
275 Other_EMA_8_lag3 -0.001
84 Total Investment_SMA_3 -0.001
199 Online_marketing_Ad_Stock_lag3 -0.001
198 Online_marketing_Ad_Stock_lag2 -0.001
113 TV_EMA_8_lag1 -0.001
167 Content Marketing_SMA_3_lag3 -0.001
307 Max Temp_lag3 -0.001
112 TV_EMA_8 -0.001
282 NPS_lag2 -0.001
234 SEM_EMA_8_lag2 -0.002
232 SEM_EMA_8 -0.002
8 deliverybdays -0.002
255 Radio_EMA_8_lag3 -0.002
3 gmv_lag3 -0.002
273 Other_EMA_8_lag1 -0.002
335 Total Precip (mm)_lag3 -0.002
236 SEM_Ad_Stock -0.002
256 Radio_Ad_Stock -0.002
12 deliverycdays -0.002
124 Digital_SMA_3 -0.002
175 Content Marketing_EMA_8_lag3 -0.002
182 Online marketing_lag2 -0.002
130 Digital_SMA_5_lag2 -0.002
117 TV_Ad_Stock_lag1 -0.002
294 Stock Index_lag2 -0.002
219 Affiliates_Ad_Stock_lag3 -0.002
210 Affiliates_SMA_5_lag2 -0.002
145 Sponsorship_SMA_3_lag1 -0.002
202 Affiliates_lag2 -0.002
156 Sponsorship_Ad_Stock -0.002
179 Content_Marketing_Ad_Stock_lag3 -0.002
116 TV_Ad_Stock -0.003
258 Radio_Ad_Stock_lag2 -0.003
103 TV_lag3 -0.003
102 TV_lag2 -0.003
176 Content_Marketing_Ad_Stock -0.003
268 Other_SMA_5 -0.003
81 Total Investment_lag1 -0.003
226 SEM_SMA_3_lag2 -0.003
60 product_vertical_slingbox -0.003
27 is_cod_lag3 -0.003
238 SEM_Ad_Stock_lag2 -0.003
260 Other -0.004
119 TV_Ad_Stock_lag3 -0.004
31 is_mass_market_lag3 -0.004
55 product_vertical_homeaudiospeaker_lag3 -0.004
269 Other_SMA_5_lag1 -0.004
239 SEM_Ad_Stock_lag3 -0.004
69 product_vertical_voicerecorder_lag1 -0.004
204 Affiliates_SMA_3 -0.004
276 Other_Ad_Stock -0.004
252 Radio_EMA_8 -0.004
172 Content Marketing_EMA_8 -0.004
274 Other_EMA_8_lag2 -0.004
40 product_vertical_dockingstation -0.004
310 Min Temp_lag2 -0.005
272 Other_EMA_8 -0.005
329 Total Snow (cm)_lag1 -0.005
1 gmv_lag1 -0.005
184 Online marketing_SMA_3 -0.005
47 product_vertical_fmradio_lag3 -0.005
19 sla_lag3 -0.005
39 product_vertical_dock_lag3 -0.005
118 TV_Ad_Stock_lag2 -0.005
14 deliverycdays_lag2 -0.005
308 Min Temp -0.006
146 Sponsorship_SMA_3_lag2 -0.006
181 Online marketing_lag1 -0.006
171 Content Marketing_SMA_5_lag3 -0.006
111 TV_SMA_5_lag3 -0.006
328 Total Snow (cm) -0.006
10 deliverybdays_lag2 -0.006
34 product_vertical_djcontroller_lag2 -0.006
201 Affiliates_lag1 -0.007
341 Sale_lag1 -0.007
160 Content Marketing -0.007
11 deliverybdays_lag3 -0.007
254 Radio_EMA_8_lag2 -0.007
123 Digital_lag3 -0.007
253 Radio_EMA_8_lag1 -0.007
136 Digital_Ad_Stock -0.007
132 Digital_EMA_8 -0.008
314 Mean Temp_lag2 -0.008
127 Digital_SMA_3_lag3 -0.008
35 product_vertical_djcontroller_lag3 -0.008
67 product_vertical_soundmixer_lag3 -0.008
135 Digital_EMA_8_lag3 -0.008
144 Sponsorship_SMA_3 -0.008
207 Affiliates_SMA_3_lag3 -0.008
209 Affiliates_SMA_5_lag1 -0.008
228 SEM_SMA_5 -0.008
292 Stock Index -0.009
15 deliverycdays_lag3 -0.009
110 TV_SMA_5_lag2 -0.009
138 Digital_Ad_Stock_lag2 -0.010
134 Digital_EMA_8_lag2 -0.010
244 Radio_SMA_3 -0.010
313 Mean Temp_lag1 -0.011
45 product_vertical_fmradio_lag1 -0.011
25 is_cod_lag1 -0.011
104 TV_SMA_3 -0.011
37 product_vertical_dock_lag1 -0.011
101 TV_lag1 -0.012
243 Radio_lag3 -0.012
22 product_procurement_sla_lag2 -0.012
109 TV_SMA_5_lag1 -0.012
265 Other_SMA_3_lag1 -0.013
107 TV_SMA_3_lag3 -0.013
250 Radio_SMA_5_lag2 -0.014
317 Heat Deg Days_lag1 -0.014
139 Digital_Ad_Stock_lag3 -0.014
29 is_mass_market_lag1 -0.015
280 NPS -0.015
141 Sponsorship_lag1 -0.016
42 product_vertical_dockingstation_lag2 -0.016
86 Total Investment_SMA_3_lag2 -0.016
131 Digital_SMA_5_lag3 -0.016
63 product_vertical_slingbox_lag3 -0.017
148 Sponsorship_SMA_5 -0.017
343 Sale_lag3 -0.018
321 Cool Deg Days_lag1 -0.018
53 product_vertical_homeaudiospeaker_lag1 -0.018
51 product_vertical_hifisystem_lag3 -0.020
88 Total Investment_SMA_5 -0.021
49 product_vertical_hifisystem_lag1 -0.023
319 Heat Deg Days_lag3 -0.025
298 Stock Index_SMA_3_lag2 -0.043
300 Stock Index_SMA_5 -0.043
288 NPS_SMA_5 -0.044
286 NPS_SMA_3_lag2 -0.044
16 sla -0.050

Plotting the Features in descending order of Importance for homeaudio

In [404]:
# Use a tall figure so that every feature label stays readable.
plt.figure(figsize=(10, 35), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.barplot(y='Features', x='Coefficients', palette='husl', data=homeaudio_lr_coef_df, estimator=np.sum)
# Automatically adjust subplot params so that the subplots fit into the figure area.
plt.tight_layout()

# display the plot
plt.show()
The 5 most important features affecting GMV(Revenue) for homeaudio are:
Features Coefficients
product_vertical_homeaudiospeaker 0.136
is_mass_market 0.133
product_vertical_fmradio 0.122
is_cod 0.112
product_vertical_voicerecorder 0.104

Model Dashboard

| Product Sub-category | Linear Regression Model | Cross Validation | R-square on Test Dataset | Mean Square Error |
|---|---|---|---|---|
| __cameraaccessory__ | Additive | No | 0.83 | 0.17 |
| | | Yes | -0.8 | 1.08 |
| | Multiplicative | No | 0.84 | 0.36 |
| | | Yes | 0.91 | 0.09 |
| | Koyck | No | 0.84 | 0.16 |
| | | Yes | 0.27 | 0.73 |
| | Distributed Lag Model (Additive) | No | 0.87 | 0.12 |
| | | Yes | 0.82 | 0.17 |
| | Distributed Lag Model (Multiplicative) | No | 0.77 | 0.50 |
| | | Yes | 0.82 | 0.18 |
| __gamingaccessory__ | Additive | No | 0.93 | 0.05 |
| | | Yes | 0.51 | 0.49 |
| | Multiplicative | No | 0.94 | 0.09 |
| | | Yes | 0.94 | 0.06 |
| | Koyck | No | 0.93 | 0.05 |
| | | Yes | 0.49 | 0.51 |
| | Distributed Lag Model (Additive) | No | 0.87 | 0.10 |
| | | Yes | 0.92 | 0.08 |
| | Distributed Lag Model (Multiplicative) | No | 0.93 | 0.11 |
| | | Yes | 0.89 | 0.11 |
| __homeaudio__ | Additive | No | 0.96 | 0.09 |
| | | Yes | 0.73 | 0.27 |
| | Multiplicative | No | -0.63 | 0.34 |
| | | Yes | 0.86 | 0.14 |
| | Koyck | No | 0.96 | 0.09 |
| | | Yes | 0.70 | 0.30 |
| | Distributed Lag Model (Additive) | No | 0.42 | 1.39 |
| | | Yes | 0.55 | 0.45 |
| | Distributed Lag Model (Multiplicative) | No | -0.23 | 0.26 |
| | | Yes | 0.57 | 0.43 |

Model Selection

We choose each model on two criteria: accuracy, measured by the R2 score and the MSE, and the business relevance of the important attributes the model selects. We also prefer models fitted with cross validation. Models without it sometimes give good scores, but with such a limited dataset those scores are not very dependable or generalisable.
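To make the cross-validation criterion concrete, here is a minimal sketch of how out-of-fold R2 and MSE can be computed. It is not the notebook's own pipeline: the data is synthetic, the names `X_demo`, `y_demo`, `predictions_cv`, `r2_cv` and `mse_cv` are hypothetical, and the 5-fold split is written by hand with NumPy to keep the example dependency-free.

```python
import numpy as np

# Synthetic stand-in for ~50 weekly rows with 5 scaled KPIs (illustrative only).
rng = np.random.default_rng(42)
n_weeks, n_kpis, n_folds = 50, 5, 5
X_demo = rng.normal(size=(n_weeks, n_kpis))
y_demo = X_demo @ np.array([0.5, 0.3, 0.0, -0.2, 0.1]) + rng.normal(scale=0.1, size=n_weeks)

# Add an intercept column to the design matrix.
X_design = np.column_stack([np.ones(n_weeks), X_demo])

# Shuffle the row indices once and split them into folds.
folds = np.array_split(rng.permutation(n_weeks), n_folds)

# Each week is predicted by a model that never saw it during fitting.
predictions_cv = np.empty(n_weeks)
for test_idx in folds:
    train_mask = np.ones(n_weeks, dtype=bool)
    train_mask[test_idx] = False
    coef, *_ = np.linalg.lstsq(X_design[train_mask], y_demo[train_mask], rcond=None)
    predictions_cv[test_idx] = X_design[test_idx] @ coef

# Score the out-of-fold predictions against the actual values.
residuals = y_demo - predictions_cv
mse_cv = np.mean(residuals ** 2)
r2_cv = 1.0 - np.sum(residuals ** 2) / np.sum((y_demo - y_demo.mean()) ** 2)
print(f"CV R-square: {r2_cv:.2f}, CV MSE: {mse_cv:.2f}")
```

Because every prediction comes from a model that did not train on that row, these scores are a fairer guide to how the model generalises than scores from a single train/test split on a small dataset.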

Referring to the model dashboard above, we finalise the following models for the 3 product sub-categories - Camera Accessory, Gaming Accessory & Home Audio:

| Product Sub-category | Linear Regression Model | R-square on Test Dataset | Mean Square Error | Top 5 KPIs |
|---|---|---|---|---|
| __cameraaccessory__ | Multiplicative with CV | 0.91 | 0.09 | product_vertical_lens (__0.181__) |
| | | | | product_vertical_camerabattery (__0.160__) |
| | | | | is_mass_market (__0.149__) |
| | | | | product_vertical_camerabatterycharger (__0.121__) |
| | | | | TV (__0.105__) |
| __gamingaccessory__ | Multiplicative with CV | 0.94 | 0.06 | product_vertical_gamingheadset (__0.250__) |
| | | | | is_mass_market (__0.234__) |
| | | | | product_vertical_gamingmouse (__0.224__) |
| | | | | product_vertical_gamepad (__0.211__) |
| | | | | Online marketing_SMA_3 (__0.157__) |
| __homeaudio__ | Multiplicative with CV | 0.86 | 0.14 | product_vertical_homeaudiospeaker (__0.469__) |
| | | | | is_mass_market (__0.289__) |
| | | | | product_vertical_fmradio (__0.224__) |
| | | | | Radio_Ad_Stock (__0.147__) |
| | | | | Sponsorship (__0.121__) |

All 3 chosen models, one per sub-category, are Multiplicative models. In a multiplicative model the KPIs combine as a product rather than a sum, so the effect of one KPI on revenue depends on the levels of the others. In other words, the KPIs interact. These models relate the growth of revenue to the joint growth of the KPIs.
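The following sketch illustrates why a multiplicative model captures growth relationships. It assumes the common formulation in which the model is fitted as a linear regression on log-transformed variables, which may differ in detail from the transformation used earlier in this notebook. The data and the names `tv_spend`, `online_spend` and `revenue` are synthetic and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_weeks = 200  # synthetic sample size, larger than the real dataset

# Hypothetical weekly spends on two channels.
tv_spend = rng.uniform(1.0, 10.0, size=n_weeks)
online_spend = rng.uniform(1.0, 10.0, size=n_weeks)

# Multiplicative data-generating process:
# revenue = 5 * tv^0.3 * online^0.5 * noise
noise = np.exp(rng.normal(scale=0.05, size=n_weeks))
revenue = 5.0 * tv_spend ** 0.3 * online_spend ** 0.5 * noise

# Taking logs turns the product into a sum that least squares can fit:
# log(revenue) = log(5) + 0.3 * log(tv) + 0.5 * log(online) + error
X_log = np.column_stack([np.ones(n_weeks), np.log(tv_spend), np.log(online_spend)])
coef, *_ = np.linalg.lstsq(X_log, np.log(revenue), rcond=None)
intercept, tv_elasticity, online_elasticity = coef

print(f"scale: {np.exp(intercept):.2f}")
print(f"TV elasticity: {tv_elasticity:.3f}, Online elasticity: {online_elasticity:.3f}")
```

Under this formulation each fitted coefficient is an elasticity: the approximate percentage change in revenue for a 1% change in that KPI, with the other KPIs held constant.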

Model Evaluation

Evaluating the Final Models:

In [405]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(12, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 3, 1)
c = range(1, len(y_cameraaccessory_mul) + 1)                   # observation index
plt.plot(c, y_cameraaccessory_mul, color="blue", linewidth=2.5, linestyle="-")              # actual
plt.plot(c, cameraaccessory_mul_predictions_cv, color="red", linewidth=2.5, linestyle="-")  # predicted
plt.suptitle('Actual vs Predicted', fontsize=16, color='c')    # Figure heading
plt.title('Camera Accessory', fontsize=12)                     # Plot heading
plt.xlabel('Index', fontsize=12)                               # X-label
plt.ylabel('GMV', fontsize=12)                                 # Y-label

# subplot 2
plt.subplot(1, 3, 2)
c = range(1, len(y_gamingaccessory_mul) + 1)                   # observation index
plt.plot(c, y_gamingaccessory_mul, color="blue", linewidth=2.5, linestyle="-")              # actual
plt.plot(c, gamingaccessory_mul_predictions_cv, color="red", linewidth=2.5, linestyle="-")  # predicted
plt.title('Gaming Accessory', fontsize=12)                     # Plot heading
plt.xlabel('Index', fontsize=12)                               # X-label
plt.ylabel('GMV', fontsize=12)                                 # Y-label

# subplot 3
plt.subplot(1, 3, 3)
c = range(1, len(y_homeaudio_mul) + 1)                         # observation index
plt.plot(c, y_homeaudio_mul, color="blue", linewidth=2.5, linestyle="-")                    # actual
plt.plot(c, homeaudio_mul_predictions_cv, color="red", linewidth=2.5, linestyle="-")        # predicted
plt.title('Home Audio', fontsize=12)                           # Plot heading
plt.xlabel('Index', fontsize=12)                               # X-label
plt.ylabel('GMV', fontsize=12)                                 # Y-label

# display the plot
plt.show()

Plotting the actual (blue) and predicted (red) GMV values for each sub-category to check how closely they match.

In [406]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(12, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 3, 1)
# Plotting y_test and y_pred to understand the spread.
plt.scatter(y_cameraaccessory_mul,cameraaccessory_mul_predictions_cv)
plt.suptitle('y_actual vs y_predicted', fontsize=16, color = 'c')              # Plot heading 
plt.title('Camera Accessory', fontsize=12)              # Plot heading 
plt.xlabel('y_actual', fontsize=12)                          # X-label
plt.ylabel('y_pred', fontsize=12)                          # Y-label

# subplot 2
plt.subplot(1, 3, 2)
# Plotting y_test and y_pred to understand the spread.
plt.scatter(y_gamingaccessory_mul,gamingaccessory_mul_predictions_cv)
plt.title('Gaming Accessory', fontsize=12)              # Plot heading 
plt.xlabel('y_actual', fontsize=12)                          # X-label
plt.ylabel('y_pred', fontsize=12)                          # Y-label

# subplot 3
plt.subplot(1, 3, 3)
# Plotting y_test and y_pred to understand the spread.
plt.scatter(y_homeaudio_mul,homeaudio_mul_predictions_cv)
plt.title('Home Audio', fontsize=12)              # Plot heading 
plt.xlabel('y_actual', fontsize=12)                          # X-label
plt.ylabel('y_pred', fontsize=12)                          # Y-label

# display the plot
plt.show()

Drawing a scatter plot of the actual against the predicted GMV values to check the spread.

In [407]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(12, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 3, 1)
# Error terms
c = [i for i in range(1,51,1)]
plt.scatter(c,y_cameraaccessory_mul - cameraaccessory_mul_predictions_cv)
plt.suptitle('Error Terms', fontsize=16, color = 'c')              # Plot heading
plt.title('Camera Accessory', fontsize=12)              # Plot heading 
plt.xlabel('Index', fontsize=12)                      # X-label
plt.ylabel('y_act - y_pred', fontsize=12)                # Y-label

# subplot 2
plt.subplot(1, 3, 2)
# Error terms
c = [i for i in range(1,52,1)]
plt.scatter(c,y_gamingaccessory_mul - gamingaccessory_mul_predictions_cv)
plt.title('Gaming Accessory', fontsize=12)              # Plot heading 
plt.xlabel('Index', fontsize=12)                      # X-label

# subplot 3
plt.subplot(1, 3, 3)
# Error terms
c = [i for i in range(1,49,1)]
plt.scatter(c,y_homeaudio_mul - homeaudio_mul_predictions_cv)
plt.title('Home Audio', fontsize=12)              # Plot heading 
plt.xlabel('Index', fontsize=12)                      # X-label

# display the plot
plt.show()

Drawing a scatter plot of the error terms against the observation index to check that they have constant variance (homoscedasticity). The variance neither increases nor decreases, and follows no pattern, across the observations.
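The visual check can be complemented with a rough numeric one. The sketch below is a simple heuristic, not a formal test: it correlates the absolute error with the observation index, so a value near 0 suggests constant variance and a clearly positive or negative value suggests the spread changes over time. The helper `abs_residual_trend` and the synthetic error series are hypothetical.

```python
import numpy as np

def abs_residual_trend(residuals):
    """Correlation between |residual| and observation index (near 0: no trend in spread)."""
    residuals = np.asarray(residuals, dtype=float)
    index = np.arange(len(residuals))
    return np.corrcoef(index, np.abs(residuals))[0, 1]

rng = np.random.default_rng(1)
n_obs = 500  # synthetic; far more points than the weekly data has

# Errors with constant spread, and errors whose spread grows with the index.
constant_var = rng.normal(scale=1.0, size=n_obs)
growing_var = rng.normal(size=n_obs) * np.linspace(0.1, 2.0, n_obs)

trend_constant = abs_residual_trend(constant_var)
trend_growing = abs_residual_trend(growing_var)
print(f"constant spread: {trend_constant:.2f}, growing spread: {trend_growing:.2f}")
```

In this notebook the same helper could be applied to, for example, `y_cameraaccessory_mul - cameraaccessory_mul_predictions_cv`. With only about 50 weekly observations the estimate is noisy, so it should be read alongside the scatter plot rather than in place of it.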

In [408]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(12, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 3, 1)
# Plot the histogram of the error terms
sns.distplot((y_cameraaccessory_mul-cameraaccessory_mul_predictions_cv), bins = 20)
plt.suptitle('Error Terms', fontsize = 16)                  # Figure heading (no `fig` object is defined in this cell)
plt.title('Camera Accessory', fontsize=12)              # Plot heading 
plt.xlabel('Errors', fontsize = 12)                         # X-label


# subplot 2
plt.subplot(1, 3, 2)
# Plot the histogram of the error terms
sns.distplot((y_gamingaccessory_mul-gamingaccessory_mul_predictions_cv), bins = 20)
plt.title('Gaming Accessory', fontsize=12)              # Plot heading 
plt.xlabel('Errors', fontsize = 12)                         # X-label

# subplot 3
plt.subplot(1, 3, 3)
# Plot the histogram of the error terms
sns.distplot((y_homeaudio_mul-homeaudio_mul_predictions_cv), bins = 20)
plt.title('Home Audio', fontsize=12)              # Plot heading 
plt.xlabel('Errors', fontsize = 12)                         # X-label

# display the plot
plt.show()

Plotting the distribution of the error terms. The error terms follow an approximately normal distribution centred at 0, barring a few outlier values.
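A histogram can be backed up with a formal normality test. The sketch below uses the Shapiro-Wilk test from `scipy.stats` on synthetic error terms. It was not part of the original analysis, and `normal_errors` and `skewed_errors` are hypothetical series.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic error terms: one roughly normal set and one clearly skewed set.
normal_errors = rng.normal(loc=0.0, scale=0.3, size=500)
skewed_errors = rng.exponential(scale=0.3, size=500) - 0.3

# Shapiro-Wilk: W close to 1 and a large p-value are consistent with normality;
# a small p-value is evidence against it.
w_normal, p_normal = stats.shapiro(normal_errors)
w_skewed, p_skewed = stats.shapiro(skewed_errors)

print(f"normal errors: W={w_normal:.3f}, p={p_normal:.3f}, mean={normal_errors.mean():.3f}")
print(f"skewed errors: W={w_skewed:.3f}, p={p_skewed:.3g}")
```

Applied to the notebook's error terms, for example `y_homeaudio_mul - homeaudio_mul_predictions_cv`, a large p-value would support the reading of the histograms above. A few outliers in a sample of about 50 weeks can be enough to make the test reject.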

In [409]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(12, 4), dpi=100, facecolor='w', edgecolor='k', frameon=True)
sns.set_style("white") # white/whitegrid/dark/ticks
sns.set_context("paper") # talk/poster

# subplot 1
plt.subplot(1, 3, 1)
# Scatter of actual vs predicted values with the fitted regression line
sns.regplot(x=y_cameraaccessory_mul, y=cameraaccessory_mul_predictions_cv)
plt.suptitle('Best Fitted Line', fontsize=16, color = 'c')              # Plot heading  
plt.title('Camera Accessory', fontsize=12)              # Plot heading 
plt.xlabel('y_actual', fontsize=12)                          # X-label
plt.ylabel('y_pred', fontsize=12)                          # Y-label


# subplot 2
plt.subplot(1, 3, 2)
# Scatter of actual vs predicted values with the fitted regression line
sns.regplot(x=y_gamingaccessory_mul, y=gamingaccessory_mul_predictions_cv)
plt.title('Gaming Accessory', fontsize=12)              # Plot heading 
plt.xlabel('y_actual', fontsize=12)                          # X-label
plt.ylabel('y_pred', fontsize=12)                          # Y-label

# subplot 3
plt.subplot(1, 3, 3)
# Scatter of actual vs predicted values with the fitted regression line
sns.regplot(x=y_homeaudio_mul, y=homeaudio_mul_predictions_cv)
plt.title('Home Audio', fontsize=12)              # Plot heading 
plt.xlabel('y_actual', fontsize=12)                          # X-label
plt.ylabel('y_pred', fontsize=12)                          # Y-label

# display the plot
plt.show()

Drawing a scatter plot of the actual against the predicted GMV values, with the best-fitted line through it, to check the spread.

Model Equation

Considering the top 5 KPIs from the chosen models for our 3 product sub-categories, the equations of the best-fitted lines are as follows:

Camera Accessory:

Revenue = 0.0 + (0.181 × __product_vertical_lens__) + (0.160 × __product_vertical_camerabattery__) + (0.149 × __is_mass_market__) + (0.121 × __product_vertical_camerabatterycharger__) + (0.105 × __TV__)

__Gaming Accessory:__

Revenue = 0.0 + (0.250 × __product_vertical_gamingheadset__) + (0.234 × __is_mass_market__) + (0.224 × __product_vertical_gamingmouse__) + (0.211 × __product_vertical_gamepad__) + (0.157 × __Online marketing_SMA_3__)

__Home Audio:__

Revenue = 0.0 + (0.469 × __product_vertical_homeaudiospeaker__) + (0.289 × __is_mass_market__) + (0.224 × __product_vertical_fmradio__) + (0.147 × __Radio_Ad_Stock__) + (0.121 × __Sponsorship__)

Each equation indicates how much the Revenue grows with a unit growth in any one of these KPIs while all other KPIs are held constant.
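As a worked example, the coefficients of the Camera Accessory equation can be turned into a what-if estimate. This sketch assumes the coefficients act as elasticities in a log-log (multiplicative) model and ignores any feature scaling applied before fitting, so the numbers are indicative only. The helper `revenue_multiplier` is hypothetical.

```python
# Top 5 coefficients of the chosen Camera Accessory model, copied from the equation above.
camera_coefficients = {
    "product_vertical_lens": 0.181,
    "product_vertical_camerabattery": 0.160,
    "is_mass_market": 0.149,
    "product_vertical_camerabatterycharger": 0.121,
    "TV": 0.105,
}

def revenue_multiplier(coefficients, kpi_growth):
    """Revenue multiplier implied by a log-log model when KPIs grow by the given fractions.

    Assumes each coefficient is an elasticity; KPIs not listed are held constant.
    """
    multiplier = 1.0
    for kpi, growth in kpi_growth.items():
        multiplier *= (1.0 + growth) ** coefficients[kpi]
    return multiplier

# What-if: TV grows by 10% while everything else stays the same.
uplift = revenue_multiplier(camera_coefficients, {"TV": 0.10}) - 1.0
print(f"Estimated revenue uplift: {uplift:.2%}")
```

Under these assumptions a 10% increase in the TV KPI corresponds to a revenue uplift of about 1%, since 1.10 raised to the power 0.105 is roughly 1.01.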

Recommendation

In [410]:
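# Keep the first 5 and last 5 rows of each coefficient table (the tables are sorted by coefficient).
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
cameraaccessory_mul_lr_coef_df = pd.concat([cameraaccessory_mul_lr_coef_df.head(), cameraaccessory_mul_lr_coef_df.tail()])
gamingaccessory_mul_lr_coef_df = pd.concat([gamingaccessory_mul_lr_coef_df.head(), gamingaccessory_mul_lr_coef_df.tail()])
homeaudio_mul_lr_coef_df = pd.concat([homeaudio_mul_lr_coef_df.head(), homeaudio_mul_lr_coef_df.tail()])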
gamingaccessory_mul_lr_coef_df
Out[410]:
Features Coefficients
13 product_vertical_gamingheadset 0.250
7 is_mass_market 0.234
16 product_vertical_gamingmouse 0.224
9 product_vertical_gamepad 0.211
50 Online marketing_SMA_3 0.157
67 Radio_EMA_8 -0.084
72 Other_EMA_8 -0.084
1 Discount% -0.088
42 Sponsorship_EMA_8 -0.090
15 product_vertical_gamingmemorycard -0.127

Plotting the top 5 features that affect each of the 3 product sub-categories (both positively and adversely) as per our chosen models

In [411]:
# Slightly alter the figure size to make it more horizontal.
plt.figure(figsize=(15,6), dpi=100, facecolor='w', edgecolor='k', frameon=True)

# subplot 1
plt.subplot(1, 3, 1)
sns.barplot(x='Features', y='Coefficients', color = 'tab:red',alpha=0.8, data=cameraaccessory_mul_lr_coef_df, estimator=np.sum)
plt.title('Camera Accessories', fontsize=12, alpha=0.8)
plt.xticks(rotation='vertical', fontsize=10)

# subplot 2
plt.subplot(1, 3, 2)
sns.barplot(x='Features', y='Coefficients', color = 'tab:red',alpha=0.8, data=gamingaccessory_mul_lr_coef_df, estimator=np.sum)
plt.title('Gaming Accessories', fontsize=12,  alpha=0.8)
plt.xticks(rotation='vertical', fontsize=10)

# subplot 3
plt.subplot(1, 3, 3)
sns.barplot(x='Features', y='Coefficients', color = 'tab:red',alpha=0.8, data=homeaudio_mul_lr_coef_df, estimator=np.sum)
plt.title('Home Audio', fontsize=12,  alpha=0.8)
plt.suptitle('Top 5 features that affect Each of the 3 Product Sub-categories(both positively and adversely) \
as per our Chosen Models', fontsize=15, color = 'r', alpha=1)
plt.xticks(rotation='vertical', fontsize=10)

# Automatically adjust subplot params so that the subplots fit into the figure area.
#plt.tight_layout()

# display the plot
plt.show()

Recommendations

Camera Accessory:

  • The company should promote Lenses, Camera Batteries & Camera Battery Chargers, as they fetch the highest revenue.
  • Advertising spend on TV has a positive impact on revenue: one unit of TV spend can boost revenue by 0.105 units. Content Marketing spend, on the other hand, has a negative impact.
  • Mass-market products contribute more to revenue growth than Luxury products.
  • Higher discount percentages in this sub-category generally work against revenue and bring it down.

__Gaming Accessory:__

  • The company should promote Gaming Headsets, Gaming Mice & Gamepads, as they fetch the highest revenue. Gaming Memory Cards, on the contrary, have the most negative coefficient (-0.127) and pull revenue down.
  • Advertising spend on Online Marketing, Radio & Others has a positive cumulative impact on revenue. Sponsorship spend, on the other hand, has a negative cumulative effect.
  • Mass-market products contribute more to revenue growth than Luxury products.
  • Higher discount percentages in this sub-category generally work against revenue and bring it down.

__Home Audio:__

  • The company should promote Home Audio Speakers & FM Radios, as they fetch the highest revenue.
  • Mass-market products contribute more to revenue growth than Luxury products.
  • Radio Adstock (the carry-over effect of Radio advertising) helps to boost revenue to a significant extent.
  • Advertising spend on Sponsorship has a positive impact on revenue. Content Marketing spend, on the other hand, has a negative impact.
  • COD payments in this sub-category generally bring revenue down.

__In General:__

  • Most sales take place when Discount% is between 50-60%. However, that doesn't necessarily boost revenue. EDA shows that an average discount% between 10-20% is the most profitable for the company, especially among luxury items.
  • In general, most of the Home Audio items sold are luxury items, and hence customers prefer to use COD instead of paying upfront.
  • During festive times (e.g. Thanksgiving), more is invested in advertising and good promotional offers are rolled out, which usually boosts revenue. However, just providing discounts without properly advertising them across several media channels doesn't help. For weeks 32-35 (August), revenue was the lowest across all 3 product sub-categories, even though the median discount% was raised after the initial drought. In fact, this dip in revenue relates directly to the minimal total investment in ads during that timeframe.